1. Introduction

The National Software Works (NSW) is a significant new step
in the development of distributed processing systems and computer
networks. NSW is an ambitious project to link a set of
geographically distributed and diverse hosts with an operating system
which appears as a single entity to a prospective user.

The  National  Software  Works  is  being  developed  in  response  to 
a growing concern over the high cost of software. The Air Force has
estimated that in 1972 it spent between $1 billion and $1.5 billion
on  software,  about  three  times  the  annual  expenditure  on  computer 
hardware. The Air Force has further estimated that by 1985 software
expenditures will be over 90% of total computer system costs.

Since  the  early  days  of  computing,  in  fact,  the  cost  and 
complexity  of  developing  and  maintaining  software  have  been 
substantial obstacles to the efficient and effective use of
computers. To breach this barrier, both industry and government have
committed vast resources for the development of tools -- automated
aids for the implementors of software and the managers of software
implementation  projects.  These  tools  include  compilers,  editors, 
debuggers,  design  systems,  test  management  tools,  language  analyzers, 
etc. 

The  difficulty  is  not  the  existence  of  suitable  tools  for  a 
given programming task; it is the availability of the tools. The
notion of software portability, often proposed as the solution for
the problem of providing programming tools in some environment, has
proven  to  be  a  will-o'-the-wisp  which  the  industry  has  vainly  pursued 
for  the  past  twenty  years. 

The  success  of  the  Arpanet  in  providing  programmers 
economical  access  to  geographically  dispersed  computers  provided  the 
foundation  on  which  the  NSW  concept  was  built.  Instead  of  moving  the 
software from host to host, let the programmer (and manager) use each
software  tool  on  whatever  host  it  already  occupies.  To  take  a 
specific  example,  the  Navy  requires  a  programming  support  environment 
for the UYK-20 minicomputer. There currently exist cross-assemblers
and compilers for the UYK-20 on IBM 360 hardware. On TENEX there is
a UYK-20 emulator and debugger. MULTICS has the QEDX editor. All
three of these host computers are connected by the Arpanet. Solution
-- let the programmer use these existing tools to develop UYK-20
software.

That solution sounds plausible, but it ignores some serious
practical considerations.

o You need an account on each host. This involves the
allocation  of  funding,  drawing  up  contracts,  etc. 

o The operating system on each host is different, so you must
learn  different  login  procedures,  command  languages, 
interrupt  characters,  file  naming  conventions,  etc.  Further 
you  must  not  confuse  each  system's  conventions  as  you  move 
from  tool  to  tool. 

o Files output from one tool (say QEDX on MULTICS) are to be
input to another tool (say CMS-2M on IBM 360). This involves
at least network transmission and usually file reformatting.
To appreciate the magnitude of this problem one should try
to use FTP (Arpanet File Transfer Protocol) to move a QEDX
output file -- a sequential file of 9 bit ASCII characters
in 36 bit words -- to an IBM 360 to be a CMS-2M input file --
a blocked file of 80 EBCDIC character records in 32 bit
words.

These  and  similar  problems  will  be  familiar  to  anyone  who  has  used 
several  different  systems. 
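
The reformatting step in the last item can be made concrete with a
short sketch. It assumes four 9 bit characters packed left to right
in each 36 bit word, zero used as padding, and blank-padded fixed
length output records; the function names and the use of Python's
cp037 codec as the EBCDIC code page are illustrative assumptions,
not part of NSW.

```python
"""Sketch: repack 9-bit ASCII characters held in 36-bit words into
fixed-length EBCDIC records. All names here are hypothetical."""

CHARS_PER_WORD = 4   # four 9-bit characters fit in one 36-bit word
CHAR_BITS = 9


def unpack_words(words):
    """Return the text held in a sequence of 36-bit integers."""
    chars = []
    for word in words:
        for slot in range(CHARS_PER_WORD):
            shift = CHAR_BITS * (CHARS_PER_WORD - 1 - slot)
            code = (word >> shift) & 0x1FF
            if code:                     # assumption: zero is padding
                chars.append(chr(code & 0x7F))
    return "".join(chars)


def to_ebcdic_records(text, record_length):
    """Split text into lines and emit blank-padded EBCDIC records."""
    records = []
    for line in text.splitlines():
        if len(line) > record_length:
            raise ValueError("line longer than one record")
        records.append(line.ljust(record_length).encode("cp037"))
    return records


def pack_words(text):
    """Inverse helper, used only to build sample input."""
    padded = text + "\0" * (-len(text) % CHARS_PER_WORD)
    words = []
    for i in range(0, len(padded), CHARS_PER_WORD):
        word = 0
        for ch in padded[i:i + CHARS_PER_WORD]:
            word = (word << CHAR_BITS) | ord(ch)
        words.append(word)
    return words
```

For example, to_ebcdic_records(unpack_words(words), 80) turns an
editor output file into card-image records. Even this toy version
has to settle packing order, padding, line termination and record
length, none of which a plain file transfer decides for the user.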

The  purpose  of  NSW  is  to  make  this  solution  (of  providing 
programmers  access  to  tools  on  different  hosts)  a  practical  reality. 

The NSW user should not have to know about OS/360, TENEX, and MULTICS
with  their  differing  file  systems,  login  procedures,  system  commands, 
etc.;  knowledge  of  how  to  use  the  individual  tools  which  are  needed 
for the job should suffice. He should not have to worry about
reformatting and moving files from a 360 to a TENEX; file
transmission  should  be  completely  transparent.  The  user  should  not 
have  to  worry  about  obtaining  accounts  on  many  different  machines, 
but  instead  should  have  a  single  NSW  account. 

Thus, the National Software Works is to provide programmers with a

o  Unified  tool  kit  -  distributed  over  many  hosts,  and  a 

o  Single  monitor  with 

.  uniform  command  language, 

.  global  file  system, 

.  single  access  control,  accounting,  and  auditing  mechanism. 

2. History of NSW Project
2.1 NSW Goals

As originally conceived, NSW was to provide the
above-described facility in the context of certain specific external
goals. The first such goal was large scale. Contemporary operating
systems  support  tens  of  concurrent  users.  NSW  was  to  support  many 
more  users,  possibly  as  many  as  one  thousand.  The  catalogue  alone  of 
the  file  system  for  that  many  users  could  easily  fill  a  3330  disk 
pack.  The  table  space  required  for  keeping  track  of  one  thousand 
users and the software tools that they are using could easily exceed
the memory of a medium size TENEX.

The second goal was high reliability. If there are one
thousand  online  users,  then  a  two  hour  system  failure  costs  one 
man-year  of  work.  The  National  Software  Works  --  particularly  its 
monitor and file system -- must degrade gracefully. Failure of a
single component -- e.g., a TENEX system on which tools are running
-- must only reduce system capacity, not destroy it. Further, only
those users actually using a failed component should be affected by
its failure.
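
The cost figure quoted for a two hour failure can be checked with a
few lines of arithmetic. The length of a man-year is not stated in
the text; the sketch below assumes 50 working weeks of 40 hours.

```python
# Assumed figure (not from the text): one man-year = 50 weeks x 40 hours.
USERS_ONLINE = 1000
OUTAGE_HOURS = 2
HOURS_PER_MAN_YEAR = 50 * 40

lost_man_hours = USERS_ONLINE * OUTAGE_HOURS
lost_man_years = lost_man_hours / HOURS_PER_MAN_YEAR
print(lost_man_hours, lost_man_years)   # 2000 1.0
```

Under that assumption, 2000 lost man-hours correspond to one
man-year of work.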

The third goal was support of project management. NSW was to
provide  managers  of  software  projects  with  a  collection  of  programs, 
called  management  tools,  which  they  can  use  to  monitor  and  control 
project  activities.  The  underlying  assumption  here  is  that  a 
manager's ability to ensure that each programmer's efforts contribute
most effectively to overall project goals can be greatly enhanced by
automating routine management tasks. Furthermore, it is assumed that
a good environment for this automation is the system which supports
the  project  programming  activities  because  it  represents  an  effective 
point  for  monitoring  and  controlling  those  activities. 

The fourth goal of NSW was practicality. NSW was not to be
a  "blue  sky"  system,  whose  implementation  required  unrealistic 
assumptions  about  its  environment.  In  particular,  practicality  meant 

o Minimum modifications to existing operating systems on
Arpanet  hosts.  Minimum  was,  in  fact,  to  be  construed  as 
none.  It  was  possible  to  add  privileged  (i.e.,  non-user) 
code  to  the  existing  systems,  but  the  solution  to  the 
problem  should  not  depend  on  rewriting  the  kernel  of  any 
existing  operating  system. 

o  Minimum  modifications  to  existing  tools.  Here,  minimum  no 
longer  meant  none.  It  was  possible  to  require  some  change 
to a tool as part of the process of NSW installation, but
such  changes  should  be  small  scale  and  contained. 

o  Maximum  generality.  Any  solution  which  permits  the  easy 
installation  of  existing  tools  must  also  allow  the  easy 
construction  and  installation  of  new  tools. 


o No experimental hardware. This requirement meant that new
hardware-oriented approaches to reliability - e.g.,
PLURIBUS - cannot be used. The NSW monitor and file system
are to run on already available Arpanet hosts.

2.2 NSW Architecture

In  this  section,  we  summarize  the  NSW  design,  indicating  what 
effect the NSW goals had on the system architecture. We can factor
the NSW problem into two parts:

o The development and implementation of methodology for
excising tools from their current environments and
interfacing them with the new NSW monitor.

o  The  design  and  construction  of  a  unified  monitor  and  file 
system  for  the  Arpanet. 

In  the  next  two  subsections  we  examine  each  of  these  problems  in  turn 
and describe the components of NSW which provide solutions to the
technical difficulties involved with each part.

2.2.1 Tool Kit

We  first  have  the  task  of  excising  tools  from  their  current 
operating  environment  and  embedding  them  in  the  new  one.  In  the 
context of the goals of NSW, we will discuss the technical issues
which  must  be  solved  in  order  to  provide  the  requisite  tool 
installation  methodology. 

By  its  very  definition,  NSW  is  a  distributed  system.  Tool 
processes  run  on  different  Arpanet  hosts.  The  monitor  must  run  on  at 
least one Arpanet host. There must be some form of inter-host
inter-process  communication.  There  are  low  level  Arpanet  protocols 
for  moving  bits  from  host  to  host,  and  there  are  also  several  higher 
level  protocols  for  moving  files  and  for  terminal  communication.  None 
of  these  protocols,  however,  is  oriented  toward  the  kind  of 
inter-process communication which NSW requires. Moreover, even
though NSW is being implemented on the Arpanet, we want to keep it as
independent as possible of the underlying milieu. Network technology
is evolving, and we wish to be able to realize the NSW architecture
on tomorrow's networks as well. Hence, the first technical
problem to be solved is the definition and implementation of an
appropriate inter-host inter-process communication protocol. The
protocol developed for NSW is called MSG.


The user of a tool has a variety of mechanisms for communicating
with the tool. The user's terminal must be interfaced to the system
and its peculiarities handled -- for example, the right amount of
padding  added  after  a  carriage  return.  Control  characters  which 
happen  to  be  meaningful  to  the  local  host  must  be  intercepted  before 
they  reach  the  local  executive.  In  order  to  allow  uniform  access  to 
all  the  tools  in  NSW,  running  on  many  different  machines,  we  must 
define  a  standard  set  of  control  functions  and  implement  a  system 
component  which  interfaces  the  user  to  every  tool.  The  problem  of 
standardizing  control  functions  and  insulating  the  user  from  the 
vagaries of the different operating systems is handled by an NSW
component called the Front End.

A  tool  running  on  some  machine  makes  system  calls  requesting 
resources -- primarily file access. Since access to NSW system
resources is to be controlled, accounted for, and audited by the NSW
monitor,  such  requests  must  be  diverted  from  the  local  system  and 
instead  referred  to  that  monitor.  In  addition,  the  tool  expects  to 
have  a  communication  link  with  the  user,  and  this  link  in  NSW  is  via 
MSG  to  the  Front  End.  So,  without  modifying  the  operating  system,  we 
must  divert  the  tool's  communications  with  the  user  and  the  tool's 
requests  for  local  resources.  The  NSW  component  which  solves  this 
problem  is  called  the  Foreman. 

Finally,  we  expect  that  the  output  of  one  tool  will  be  used 
as  input  to  another  tool.  Unfortunately,  if  the  first  tool  is  a 
MULTICS editor and the second an IBM 360 compiler, this operation
involves  character  translation  (ASCII  to  EBCDIC),  file  reformatting 
(sequential  file  to  blocked,  recorded  file),  and  file  movement 
(across  the  Arpanet).  To  handle  such  file  transformations  and 
movements  there  is  an  NSW  component  called  the  File  Package. 

It  is  worth  noting  at  this  point  that  all  of  the  above 
components  are  distributed.  Every  host  in  NSW  has  an  MSG  server 
process. Every site to which a user is connected has a Front End.
Every tool bearing host has a Foreman. Every host on which NSW files
are stored has a File Package. It is also worth noting that
implementation details of these components vary from host to host.
A MULTICS Foreman will be vastly different from an IBM 360 Foreman.
Functional specifications for these components are fixed throughout
NSW, but implementation and optimization decisions are left free.

Before  proceeding  to  the  NSW  monitor,  let  us  summarize  the 
technical  problems  and  the  resulting  components  which  provide  the 
unified  tool  kit  methodology. 

o Inter-host inter-process communication        MSG

o User interface                                 Front End

o Diversion of communication with local
  operating system                               Foreman

o File transformation and movement               File Package


2.2.2 NSW Monitor and File System

The design of the NSW Monitor - called the Works Manager -
was probably more affected than any other component by the goals of
NSW. Functionally it is not different from any other conventional
access-checking, resource-granting monitor. Structurally, however,
it is significantly different.

The  goals  of  providing  both  large  scale  and  reliability  on 
conventional  hardware  led  to  the  approach  of  distributing  the  Works 
Manager and file system. If there are many instances of the NSW
monitor  on  many  different  hosts,  then  failure  of  a  host  is  not 
catastrophic.  Unfortunately,  distribution  runs  counter  to  the 
problem-required  logical  unity  of  the  monitor  and  file  system.  If  a 
user  inserts  a  file  into  the  file  system  using  one  tool  and  one 
instance  of  the  file  system,  and  then  requests  the  same  file  using  a 
different  tool  and  a  different  instance  of  the  file  system,  the  two 
instances  of  the  file  system  must  share  a  common  file  catalogue  for 
the  system  to  behave  properly.  Similarly,  all  instances  of  the 
monitor  must  share  an  access  rights  data  base  for  proper  validation 
of  user  requests  to  run  tools. 

A  major  technical  problem  in  designing  the  Works  Manager  was 
that  of  creating  synchronized  duplicate  data  bases.  The  process 
structure of the Works Manager was designed so that such
synchronization could be accomplished. Further, that process structure
can handle the intrinsic distribution of NSW. Communication between the
Works Manager and other NSW components - Front End, Foreman, and File
Package - is via MSG, a relatively slow link.

2.3  Phases  of  NSW  Development 

The  design  and  implementation  of  the  National  Software  Works 
has  proceeded  in  four  slightly  overlapping  phases 

o  Structural  design  and  feasibility  demonstration 

o  Detailed  component  design 

o  Prototype  implementation 

o  Reliability  and  performance  improvement 

In  the  following  subsections  we  describe  these  phases  in  more  detail. 


2.3.1 Structural Design and Feasibility Demonstration

The first phase of NSW development began in July, and
concluded in November, 1975. During this period, the basic
architecture of NSW (described in Section 2.2) was established.
Further, relatively ad hoc implementations of major components were
made. These components were integrated into a system which was
demonstrated to ARPA and Air Force personnel at Gunter AFB in
November, 1975. This demonstration exhibited (functionally) various
system functions, the use of batch tools on the IBM 360 and
Burroughs B4700, the use of interactive tools on TENEX, transparent
file  motion  and  translation,  and  a  primitive  set  of  project 
management  functions. 

This  demonstration  confirmed  that  the  expected  NSW  facilities 
could  be  implemented  and  that  transparent  use  of  a  distributed  tool 
kit  was  feasible.  The  NSW  System,  however,  was  inefficient  and 
fragile.  Further,  many  of  the  ad  hoc  implementations  had  design 
weaknesses  which  limited  their  general  application  to  a  sufficiently 
broad  range  of  hosts  and  capabilities.  For  these  reasons,  an  effort 
was  begun  to  produce  adequate  component  designs. 

2.3.2  Detailed  Component  Design 

This  second  phase  of  NSW  development  was  begun  in  June,  1975 
with the initial MSG design document. Specifications were developed
for Tool Bearing Host components - MSG, Foreman, and File Package.
All of these specification documents were completed by March, 1976.
(They have all been revised since then, but the original specifications
are still substantially correct.)

During  the  same  period,  the  external  specification  of  the 
Works  Manager  was  also  made.  Again,  although  this  specification  has 
subsequently  been  revised,  it  is  still  substantially  correct.  The 
remaining portions of the core of NSW - i.e., the batch tool facility:
Works Manager Operator, Interactive Batch Specifier, and Interface
Protocol  -  were  designed  during  phase  one,  and  those  designs  were 
retained  until  phase  four  (see  below). 

The remaining major NSW component, the Front End, was the
subject of several design efforts. Three incomplete specification
documents were produced but none of these was wholly satisfactory.
Nevertheless, sufficient design to allow implementation of a
functionally correct Front End was accomplished. Completion of a
general specification for the Front End is one of the tasks remaining
to  be  accomplished. 


2.3.3 Prototype Implementation

As specification documents were completed, various
contractors began implementation of the NSW components on the initial
set of hosts - TENEX, MULTICS, and IBM 360. These efforts commenced
in January, 1976. Implementation on TENEX proceeded more quickly
than the efforts on the other hosts - primarily because the NSW
system designers were also the TENEX implementors. By October, 1976
prototype implementations which conformed to the published
specifications had been made for all TENEX TBH components. In
addition, all components of the core system were available on TENEX.

Implementation of TBH components on MULTICS and IBM 360
proceeded more slowly; however, initial implementations of MSG
components on both of these hosts were completed by the end of 1976.
By November, 1976 sufficient progress had been made on implementation
of a File Package and Foreman on MULTICS that it was possible to
demonstrate an interactive tool running on MULTICS. Progress on
implementation  of  360  TBH  components  reached  a  similar  position  in 
September,  1977. 

Also during this phase, a TENEX Front End which functionally
supported the Works Manager and Foreman according to the appropriate
specifications was implemented.

An  NSW  system  containing  prototype  implementations  according 
to the specifications of the core system, TENEX TBH components, TENEX
Front End, batch IBM 360 tools, as well as a rudimentary MULTICS
interactive tool was demonstrated to Air Force and ARPA personnel in
November,  1976.  At  the  same  time,  a  demonstration  of  MSG  components 
on  all  three  hosts  was  also  given. 

2.3.4 Reliability and Performance Improvement

Even  though  implementation  of  components  on  MULTICS  and  IBM  360 
was lagging, implementation of the core system, TENEX TBH components,
and TENEX Front End had proceeded to the point that the issues of
reliability  and  performance  assumed  major  importance.  The  system 
exhibited  sufficient  functional  capability  that  it  could  clearly 
support  use  by  programmers  if  it  were  sufficiently  robust  and 
responsive. 

The  first  task  attacked  was  to  provide  robustness.  Work  had 
begun  on  a  full-scale  NSW  reliability  plan  in  1975.  The  detailed 
plan  was  released  in  January,  1977.  Since  it  was  clear  that 
implementation of the full plan was a major undertaking, a less
ambitious  interim  reliability  plan  which  ensured  against  loss  of  a 
user's  files  was  begun  in  mid-1976.  This  plan  was  also  released  in 
January, 1977. By June, 1977 the core system, TENEX Foreman, and
TENEX Front End had been modified to incorporate the features of that
interim plan. In addition, both the MULTICS and IBM 360 Foreman
(only partially implemented) were altered to conform externally to
the scenarios specified by the interim reliability plan. A system
exhibiting the new scenarios was released for use in June, 1977.

Performance of NSW had been slow from the initial
implementation.  The  reasons  for  slow  response  were  many: 

o Interaction between components was by a thin wire (MSG and
the Arpanet).

o NSW components (which constitute an operating system)
nevertheless were executed as user processes under the local
host operating system.

o Component implementation had been oriented towards ease of
debugging  and  other  concerns  of  prototype  systems  rather 
than  towards  the  performance  expected  of  a  production 
system. 

In 1977, efforts to improve NSW performance were begun.

The  first  effort  was  the  development  of  a  performance 
measuring package for TENEX MSG. Results of the first set of
measurements  were  reported  in  April,  1977.  Some  performance 
improvements  were  suggested  by  the  initial  measurements,  but  the  most 
obvious  suggestion  was  that  more  sophisticated  measuring  packages 
were needed. Several such packages were begun to perform various
kinds of measurements on TENEX components. All of these packages
were complete by February, 1978. By May, 1978, all TENEX components
had been instrumented and measurements of page use, CPU time, elapsed
time, use of JSYS (TENEX system commands), etc. had been taken under
a variety of system load conditions and on several different TENEX
hosts. Efforts are currently under way to implement the performance
improvements suggested by these measurements. Performance
improvements have already been made to several components. Results
of these improvements are described in section 3.2 below.

Concurrent  with  the  effort  to  improve  NSW  reliability  and 
performance, an effort to make NSW a more packaged product was
begun. Regression tests for the externally available NSW user system
were  developed  and  applied  to  each  system  release.  A  user's  manual 
for  the  system  was  published.  Documentation  of  the  core  system  was 
produced.  Finally,  a  draft  configuration  management  plan  was 
developed. 

Phase four of NSW development is still continuing. Efforts
to  improve  performance  of  TENEX  components  are  substantially 
complete.  Certain  features  of  the  full  scale  reliability  plan  have 
also  been  implemented,  and  phase  four  should  be  complete  by  mid  1979. 
Phase  five,  development  of  a  production  NSW  system,  is  underway.  The 
efforts proposed for phase five are described in section 4 below.


3.  Current  Status 
3.1 Overview

The NSW system currently available to users was released in
November, 1978. It has the following characteristics:

o Twenty interactive TENEX tools; five of these tools are
installed in TOPS20, but some problems remain as compared to
the TENEX installations.

o Ten interactive MULTICS tools, some of which are still being
tested.

o One interactive IBM 360 tool, and nine IBM 360 batch tools.

o Basic set of system commands.

o  User  documentation  and  support. 

o Rudimentary set of management procedures.

o  Improved  operability. 

o Configuration of system includes following hosts:

   ISIC
   SRI-KA
   CCN
   RADC-TOPS20
   RADC-MULTICS

Functionally, the current NSW system is minimally adequate. It has
a  reasonable  collection  of  tools,  but  many  of  these  tools  have  not 
been  adequately  tested.  The  minimal  set  of  user  commands  is 
available  and  tested,  but  many  needed  user  features  are  lacking  -  e.g. 
command  macros,  in  file  commands,  I/O  devices,  Arpanet  mail,  etc. 
Performance  has  been  improved  significantly.  The  documentation  of 
system components has been improved, but much needs to be done.

TENEX and TOPS20 are available as Works Manager or Tool Bearing Hosts
according  to  specification,  but  TOPS20  tool  encapsulation  is 
currently  less  satisfactory  than  TENEX.  Additional  encapsulated 
tools  can  be  installed  in  either  environment  to  increase  NSW 
capacity.  Batch  tools  are  available  on  the  CCN  IBM  360/91,  and  more 
can  be  installed  as  needed.  A  major  overhaul  of  the  entire  batch 
system  has  made  it  more  consistent  with  the  rest  of  NSW,  more 
flexible, powerful, operable and resilient. The IBM 360 Foreman
implements only one interactive tool, and a minimal set of specified
features. The MULTICS implementation has been improved enough to be
included in the user system with an expanded tool set, although
problems  persist  -  particularly  in  the  Foreman  implementation. 

The  current  status  of  the  individual  component  implementations 
is presented in section 3.2, and planned improvements to the system
are presented in section 4.

3.2 Components

In the following subsections we give a description of the
current  status  of  each  component. 

3.2.1  Core  System  Components 

The core system components - Works Manager, Checkpointer, and
Works Manager Operator - are substantially complete. The Works
Manager  has  been  the  object  of  an  extensive  and  successful  effort  to 
improve  its  performance.  The  Checkpointer  has  had  its  functionality 
enhanced,  and  been  made  more  robust.  The  Works  Manager  Operator  has 
been substantially rewritten to interface to the Batch Job Package,
and  to  conform  to  the  coding  standards  imposed  on  the  Works  Manager. 

3.2.1.1 Works Manager

At present, the Works Manager consists of a number of identical
concurrent processes which implement the Works Manager procedures.
All such processes share two common data bases, the Works Manager Table
data base and the NSW File Catalogue. In addition to these processes
there  is  a  separate  process,  the  Checkpointer,  which  makes  periodic 
backup  copies  of  the  data  bases. 

The  Works  Manager  supports  36  different  Works  Manager  procedure 
calls,  which  are  available  to  other  NSW  processes.  These  procedures 
are  described  in  the  Works  Manager  System/Subsystem  Specification  and 
the  Works  Manager  Program  Maintenance  Manual. 

A  substantial  effort  was  invested  in  implementing  the  scenarios 
described in the "Interim NSW Reliability Plan" (CA-7701-2111). These
scenarios are as close as possible to the final NSW design which is
described in "NSW Reliability Plan" (CA-7701-1411). The goal of
these scenarios was to guarantee a user that a system malfunction --
other than catastrophic disk failure -- would cause few, if any, of
his/her files to be lost. This guarantee includes files stored in the
NSW  file  system  as  well  as  closed  local  files  in  a  tool's  workspace. 

It  was  not  a  goal  to  provide  continuity  of  service  in  the  face  of 
individual  component  failure,  nor  was  it  a  goal  to  eliminate 
long  (possibly  endless)  waits  by  the  user  in  the  event  of  message 
delays  or  component  failure  (these  desirable  goals  would  be  met 
by  implementing  the  complete  reliability  plan). 

In  order  to  guarantee  that  NSW  file  system  files  not  be  lost 
(except  under  rare  circumstances)  it  was  necessary  to  preserve  the 
NSW file catalogue. It was presumed that these files themselves
are preserved by some mechanism on the file bearing host. Periodically
(currently at approximately twenty minute intervals) the WM file
catalogue  is  locked,  the  entire  file  catalogue  is  copied  onto  disk, 
and  then  the  lock  is  released.  The  WM  also  maintains  a  data  base  of 
active  users,  active  tools,  etc.,  which  is  also  copied  onto  disk 
(using  the  same  mechanism  described  above  for  the  catalogue).  The 
Checkpointer,  a  new  NSW  component,  was  designed  and  implemented  to 
fulfill  this  function. 
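
The lock, copy, release cycle described above can be sketched as
follows. This is an illustration only: the class and function names,
the JSON encoding and the write-then-rename step are assumptions made
for the sketch and are not taken from the Checkpointer itself.

```python
import json
import os
import tempfile
import threading


class Catalogue:
    """In-memory file catalogue guarded by a single lock (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def insert(self, name, location):
        with self._lock:
            self._entries[name] = location

    def checkpoint(self, path):
        """Lock the catalogue, copy all of it to disk, release the lock."""
        directory = os.path.dirname(os.path.abspath(path))
        with self._lock:
            fd, tmp = tempfile.mkstemp(dir=directory)
            with os.fdopen(fd, "w") as out:
                json.dump(self._entries, out)
                out.flush()
                os.fsync(out.fileno())
            os.replace(tmp, path)   # never leave a half-written copy


def restore(path):
    """Read back the most recent checkpoint after a crash."""
    with open(path) as src:
        return json.load(src)
```

Anything inserted after the last call to checkpoint() is absent from
what restore() returns, which is the exposure that a periodic
checkpoint leaves between copies.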


The  twenty  minute  interval  introduced  a  window  during  which  a 
file  transaction  may  be  lost  if  the  WM  host  should  crash.  This 
twenty minute interval is sufficient with respect to NSW Exec commands.
However,  a  tool  might  wait  until  termination  to  deliver  any  files; 
in  this  case,  many  hours  of  work  could  be  lost.  In  order  to  avoid 
this problem, a mechanism was developed so that a Foreman could ensure
the preservation of the local tool workspace (LND) in the event of
either local host crash or the failure of other NSW components. The LND
contains any files being delivered by the Foreman on behalf of the tool.

The  mechanism  developed  ensured  that  the  LND  is  preserved  until 
after  a  file  catalogue  containing  references  to  delivered  files 
has  been  checkpointed.  The  LND  is  only  (intentionally)  erased  after 
tool  termination.  Whenever  a  tool  terminates  normally,  an  additional 
message (FM-GUARANTEE) is sent by the Checkpointer (the process
performing the file catalogue checkpoint) to every Foreman instance
which terminated since the last checkpoint. Each Foreman instance
sets  a  timer  and  if  the  FM-GUARANTEE  message  is  not  received  when  the 
timer  goes  off,  the  Foreman  saves  the  LND. 
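
The timer logic can be sketched as a small state machine. Only the
FM-GUARANTEE message name comes from the text; the class, the state
names, the explicit clock argument, and the choice to erase the LND on
receipt of the guarantee are assumptions made for illustration.

```python
class ForemanInstance:
    """Tracks one tool's workspace (LND) after normal termination."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds to wait for FM-GUARANTEE
        self.deadline = None
        self.lnd_state = "active"   # active -> pending -> erased | saved

    def tool_terminated(self, now):
        """Files are delivered; start waiting for the guarantee."""
        self.lnd_state = "pending"
        self.deadline = now + self.timeout

    def receive_guarantee(self, now):
        """FM-GUARANTEE: the catalogue naming our files is on disk."""
        if self.lnd_state == "pending" and now <= self.deadline:
            self.lnd_state = "erased"   # assumed: safe to discard now

    def tick(self, now):
        """Called periodically; saves the LND once the timer expires."""
        if self.lnd_state == "pending" and now > self.deadline:
            self.lnd_state = "saved"


def checkpoint_done(foremen, now):
    """Checkpointer side: notify each Foreman terminated since last time."""
    for foreman in foremen:
        foreman.receive_guarantee(now)
```

A Foreman that hears nothing before its deadline ends in the "saved"
state, so a lost message or a crashed Works Manager host costs disk
space rather than the user's files.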

The  requirement  for  the  Foreman  is  that  it  must  be  able  to 
maintain the LND in such a way that it is preserved over Foreman
host crashes. The Foreman must be able to explicitly invoke this
save-the-LND mechanism. This allows the Foreman to explicitly preserve
the tool's workspace should any difficulties arise during some scenario.

The  AUTOLOGOUT  scenario  is  initiated  by  a  break  in  the 
connection  between  the  user's  terminal  and  the  Front  End.  All 
running tools are forced to stop and initiate the save-the-LND mechanism
described  above. 

A mechanism was also implemented which allows the user to have
(some of) the saved files delivered to the NSW file system. This
mechanism is provided by the LNDSAVED and RERUNTOOL scenarios. Once
a Foreman has performed the save-the-LND mechanism, it informs the
Works Manager. The Works Manager maintains a record of such saved LNDs in
each user's node record. A message will be sent to the user at each
subsequent login until the user causes its deletion by using the RERUN
command (which invokes the RERUNTOOL scenario). The user will receive
messages about the saved LND until the user explicitly saves the files
(TERMINATE subcommand) or deletes them (ABORT subcommand). Currently,
these are the only two options of RERUN which are implemented; it has
been proposed that RERUN be expanded to allow the user to run a new
instance of the tool in the saved LND.

A major change was the introduction of the Works Manager Table
Facility as a performance enhancement. See Appendix A for details.

The Works Manager, which consists of approximately ib.YK
lines of BCPL code, is structured into a number of layers. At the
top level, WMMAIN waits for a procedure call message from another NSW
process, does initial decoding and validity checking of any such
message, then dispatches the message to the proper routine. The
Works Manager Routines, WMRTNS, implement the 36 Works Manager
Procedures. At their disposal are a number of lower-level utility
packages and subsystems. The Works Manager Table Package, WMTPKG,
handles all interactions with Works Manager tables. It serves as an
interface to the Information Retrieval System, IRKi<TV, which manages
the NSW File Catalogue and the Works Manager Tables. All NSW
processes written in BCPL have available NSUPKG and BCPPKG. NSUPKG
contains a number of facilities to handle MSG messages, create and
record NSW fault descriptions, etc. BCPPKG provides basic utilities
to handle character strings, do searching and sorting, and so forth.
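
The top-level behaviour attributed to WMMAIN (decode, validate, dispatch) can be illustrated with a small sketch. The message layout, the result dictionaries, and the procedure name used in the usage note are assumptions for illustration only; the real Works Manager is written in BCPL and receives its messages through MSG.

```python
def wm_main(message, routines):
    """Decode and validate one procedure-call message, then dispatch it.

    `routines` maps procedure names to callables (a stand-in for the
    table of Works Manager Procedures).
    """
    # Initial decoding and validity checking.
    if not isinstance(message, dict) or "procedure" not in message:
        return {"status": "error", "reason": "malformed message"}
    routine = routines.get(message["procedure"])
    if routine is None:
        return {"status": "error", "reason": "unknown procedure"}
    # Dispatch to the proper routine.
    return {"status": "ok", "result": routine(*message.get("args", ()))}
```

For example, with a hypothetical table `{"WM-EXAMPLE": lambda x, y: x + y}`, the call `wm_main({"procedure": "WM-EXAMPLE", "args": (2, 3)}, table)` is dispatched to that routine, while an unknown procedure name is rejected before any routine runs.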

As with other core system components and the TENEX/TOPS20
File Package, the Works Manager is transportable between TENEX
and TOPS20 without modification. See Appendix C for details.

3.2.1.2 Checkpointer

The  Checkpointer  status  mimics  that  of  the  Works  Manager, 
since  it  consists  largely  of  the  entire  Works  Manager  utility 
package,  with  a  relatively  small  upper  layer  of  code  to  implement 
the  specific  Checkpointer  procedures.  Thus,  like  other  core 
system  components,  the  Checkpointer  is  transportable  between 
TENEX and TOPS20 without modification (see Appendix C). The
performance  improvements  realized  by  the  Works  Manager  table  facility 
also  apply  to  some  Checkpointer  procedures. 

The  Checkpointer  has  the  following  characteristics: 

o Implements the FM-GUARANTEE call on the Foreman
required by the Interim Reliability Scenarios.

o Manages NSW file deletion. Files deleted
by the user are actually deleted by the Checkpointer
after a time interval, as required by the Interim
Reliability Plan.

o Makes checkpoint files of all Works Manager database
files at approximately twenty-minute intervals.

o Is robust and flexible to about the same level as the
Works Manager itself.


3.2.1.3 Works Manager Operator

The modification/partial re-implementation of the Works Manager
Operator (WMO) to meet the revised Batch Job Package specification included
as Appendix B to this report is complete. The new version of WMO
was released as a component of the candidate user NSW system on
November 16, 1978. The one WMO procedure specified is supported,
this version has been extensively tested with the corresponding
version of the CCN/360 BJP released on the same date, and there are no
known outstanding deficiencies.

WMO shares a data base (the Job Queue File) with the
Interactive Batch Specifier (IBS) module in the Works Manager. We
intend to remove this shared access by making all access to this data
base be via procedure calls on WMO, which will have sole access. To
this end, direct access to the data base by the WM to get a batch job
status (NSW: JOB) has been replaced by a call on WMO by WM on the
(new) WMO-SHOWJOB procedure. Direct access to the data base by IBS
will be replaced by use of a WMO procedure, WMO-ENTERJOB, to be
specified and implemented in the future.

The extensive modifications to WMO have allowed us to make its
programming style consistent with that used in the Works Manager and
File Package. Its use of the Works Manager utilities is also consistent
with the other components, and its logging and timeout behaviour is
identical.

Some notable characteristics of the current WMO - particularly those
not suggested by Appendix B - are as follows:

o WMO is responsible for both processing the Job Queue
File and handling WMO procedure calls. These two tasks are
handled by distinct instances of WMO in any given NSW system.

(1) There is exactly one instance of WMO processing the job
queues. A standard locking discipline guarantees
that precisely one such instance exists. This instance
executes the job steps necessary to process a
batch job, and initiates all procedure calls
to external processes (WM, BJP, FP). It never receives
generically addressed MSG messages.

(2) There are zero or more instances of WMO which receive
generically addressed MSG messages, and handle all
currently defined WMO procedures. These instances
never execute job steps or initiate external procedure
calls; thus, these instance(s) provide external access
to the data base.

o A primitive retry mechanism exists. WMO will retry an
external procedure call indefinitely when it fails due to
a network or remote host crash. It will retry a failed
external procedure call a maximum of three times if the
failure is due to resource problems, e.g. no disk space.

o Status reports generated by WMO for display by WM (NSW: JOB)
have been made more informative; all information supplied
by BJP is reported.

o  The  maximum  number  of  jobs  in  the  job  queue  file  is  currently 
bU,  This  may  be  increased  when  needed,  but  requires 
re-compilation and reloading of WMO.

o The WMO cycle number may be set manually by the WMO utility
(WMOUTL), but does not automatically increment with each
cold start. "Cold Start" in this version occurs only when a
new job queue file is created.
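
The retry rule in the list above (unbounded retries after a network or remote host crash, at most three retries after a resource failure) can be sketched as follows. The exception classes and the function name are hypothetical, and the sketch assumes that "a maximum of three times" means three retries after the initial attempt.

```python
class NetworkFailure(Exception):
    """Call failed because the network or the remote host crashed."""


class ResourceFailure(Exception):
    """Call failed because of a resource problem, e.g. no disk space."""


MAX_RESOURCE_RETRIES = 3  # assumed reading of "a maximum of three times"


def call_with_retry(procedure):
    """Invoke an external procedure call under the two-tier retry rule."""
    resource_retries = 0
    while True:
        try:
            return procedure()
        except NetworkFailure:
            # Network or remote host crash: retry indefinitely.
            continue
        except ResourceFailure:
            # Resource problem: give up after the retry budget is spent.
            if resource_retries == MAX_RESOURCE_RETRIES:
                raise
            resource_retries += 1
```

A production version would presumably wait between attempts; the text does not say how WMO paces its retries, so no delay is modelled here.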

3.2.2 TENEX/TOPS20 TBH Components

The TENEX/TOPS20 TBH is the most advanced of the three TBHs. All
components (MSG, Foreman, and File Package) are substantially complete
and tested. All components are transportable between TENEX and TOPS20.

3.2.2.1 MSG

The MSG specification was produced in January, 1976. It was
revised in December, 1976 - primarily to resolve ambiguities in the
earlier document. It was extended in April, 1978 to allow for
support of multiple, concurrent NSW systems. The TENEX/TOPS-20 MSG
component implements the revised and extended specification with only
two exceptions (which are noted below).

The TENEX/TOPS-20 implementation of MSG is a single executable
module which will run under TENEX, TOPS-20 Version 101B, and TOPS-20
Release 3. In addition to the communication functions supported for
processes (and defined by the MSG-process interface specification) the
TENEX/TOPS-20 implementation includes a powerful process monitoring
and debugging facility, and comprehensive performance monitoring
software.

The  TENEX/TOPS-20  implementation  does  not  perform  MSG-MSG 
authentication. Message sequencing and stream marking are not
implemented  (however  the  underlying  software  structure  exists  to 
support  both). 

The  current  implementation  was  extended  to  support  new 
component initiation features required to support TOPS20 TBH components.
In addition, a recent modification to MSG supports rapid timeout of
attempts to contact remote hosts where an MSG is not up, or which are
themselves down. This markedly reduces the wait time imposed on a
user  who  has  attempted  to  use  an  unavailable  resource. 

The  implementation  has  also  been  modified  to  enhance 
its  performance,  based  on  extensive  performance  measurements  completed 
this  year.  Changes  include  elimination  of  network  connections  for 
local  message  traffic,  data  re-structuring,  reduction  of  calls 
on expensive JSYSes, and improved strategies for memory allocation.

3.2.2.2 Foreman

The current TENEX/TOPS-20 Foreman (Version 1^21) implements
all scenario functions defined by the interim NSW reliability plan in
its most recent revision (March 1, 1977). The Foreman only supports
tools which run in encapsulated mode. It does not yet support the
direct use of NSW functions by any class of tools. It currently
supports approximately twenty TENEX and five TOPS20 tools in this
encapsulated mode. Some of these tools have been extensively tested
and used within NSW; others have merely been superficially exercised.

The latest release can operate on both TENEX and TOPS-20
Release 3 configurations. There is a single .SAV file which detects
at runtime the configuration type and modifies its behavior
accordingly. This newest release has now had adequate field testing
on the TOPS-20 machines. Not all TENEX NSW tools are available on
TOPS-20, and those that are have not been tested to the same degree as
their TENEX counterparts.

The current Foreman implementation handles the problem of
storing "saved" tool workspaces through the temporary means of
utilizing the workspaces themselves. A permanent facility to handle
workspace management is already designed and implementation is
pending.


The TENEX/TOPS20 Foreman has been extensively modified as a result
of the extensive performance measurements made in early 1978 and
reported in BBN report No. 3847, "A Performance Investigation of the
National Software Works System". Performance enhancement has so far been
limited to reducing resource consumption by the Foreman,
e.g. by minimizing use of expensive JSYSes, pre-allocating
workspace directories, etc. Future work will address alternative
system support configurations, and altered patterns of NSW communications.

3.2.2.3 File Package

The TENEX/TOPS20 File Package is now functionally complete. The
task of writing Intermediate Language encode/decode for non-TENEX binary
finished files is now complete, and has been tested with the CCN/360 File
Package for several representative binary file types. The current
File Package version has the following characteristics:

o All specified File Package procedures are implemented
and tested for local, family, and non-family network
transfers. Unspecified procedures to support the obsolete
IP mechanisms in WMO have been expunged.

o The Intermediate Language (IL) encode/decode package has been
re-structured for greater efficiency and maintainability.
Encode/decode has been partitioned into three classes - text
files, sequenced text files, and binary files; there is an
encode and a decode module for each class, totalling
six. Code size has increased, but both efficiency and code
comprehensibility have been greatly enhanced. The interface
between the (BCPL) calling routines and the (MACRO-10/20)
service routines has been simplified. Implementation of binary
file encode/decode is complete, and has been extensively tested
both against itself (i.e. against a remote TENEX simulating
a non-TENEX host), and against the CCN/360 File Package.
We have confirmed correct transmission of CMSLM object
files from CCN/360 to TENEX/TOPS20.

o Performance enhancements have been implemented based on the
results of BBN's performance investigation as reported in
the BBN report "A Performance Investigation of the
National Software Works System", DRAFT VERSION, July 1978,
by Richard E. Schantz. We have minimized the use
of expensive JSYSes, notably the CNDIR (connect
to directory) JSYS (average cost 220 ms per
call). We have done so by specifying that the File Package
must be able to create/read/delete files in its own filespace
and Foreman workspaces without connecting to them, and letting
it stay connected to its LOGIN directory. This has had
no practical effect on the operation of NSW, beyond requiring
that these directories be accessible from the system LOGIN
directory. These enhancements have resulted in a CPU usage
reduction  of  up  to  bOt  for  delivery  of  a  file  from  the 
Foreman workspace.

o The File Package is completely transportable between TENEX
and TOPS20, requiring no modifications or patches. The
simple transportability is based on the use of the Global
Tailoring File for filespace name and logging information, and
the use of the JSYS encapsulation packages now included in the
Works Manager utilities. (See Appendix C.)

o The logging of messages sent/received via MSG is under
control of a switch in the Global Tailoring File (as in
WM, WMO and CHKPTR). When logging is disabled, CPU usage
for typical FP calls is reduced by approximately 40%. For comparison,
the FP retrieval calls analyzed in the BBN report
"A Performance Investigation of the National Software
Works System", DRAFT VERSION, July 1978, by Richard E. Schantz,
which averaged about 2.9 ms, can be reduced to as low as
0.7 ms with logging disabled.

The File Package is written primarily in BCPL (approximately
6.9K  statements  including  utilities.)  The  IL  encode/decode  package 
is  written  in  Macro-10  and  consists  of  approximately  1.7K  instructions. 

3.2.3 IBM 360 TBH Components

The IBM 360 TBH is the second most advanced host. MSG and
the File Package are substantially complete. The Batch Job Package is
debugged and available. The weakest component is the Foreman, which
implements only a small subset of the specification.

A new overlay mechanism which supports overlaying of exclusive
segments has been constructed and installed in the File Package, Foreman,
and Batch Job Package. This mechanism was required to allow these
components to fit in available real core, and to allow for incremental
increases in code size.

3.2.3.1 MSG

The  IBM  360  MSG  component  implements  substantially  all  of  the 
revised  MSG  specification.  It  does  not  yet  implement  the  April,  1978 
extension.  The  features  of  the  current  version  are: 

o  Flow  control  is  implemented  for  both  sides. 

o The present TENEX limitation of 2048 bytes per message
is larger than CCN can handle reliably with its current
allocation of resources to the NCP region. Therefore,
CCN's MSG is being configured with a maximum inter-MSG
message size of 1024 bytes.

o An MSG process can be materialized automatically in
either TSO or batch. The IBM 360 MSG requires that a
process specifically "materialize" itself with a system
call to the central MSG. Included in this materialization
call is an event signal which will be signalled to
perform the "termination signal" function; however,
at present MSG-central never signals this event.
No mechanism exists to allow a process which is
restarting after it crashed (while MSG-central stayed
up) to resume its earlier instance number.

o  Both  Sequencing  and  Stream  Marking  have  been  implemented. 

o  MSG  now  includes  the  ability  to  automatically  start  a 
process  under  TSO  when  MSG  initializes  itself  after  a 
system  crash. 

o  Authentication  is  implemented  in  a  manner  which  does  not 
match  the  current  specifications.  The  most  important 
difference  is  that  an  ICP  is  required  to  the  CCN 
authentication  socket. 

o  Binary  direct  connections  may  use  any  byte  size,  but  byte 
sizes  smaller  than  8  bits  are  likely  to  lead  to  problems 
in  determining  the  actual  length  of  the  message. 

o It has been decided to provide for a manifold of
coexisting NSW systems on the same ARPANET hosts. This
requires that a host support multiple MSG's, using
different contact sockets. The 360 MSG was implemented
to allow both a "production" and a "test" MSG to
coexist, using different contact sockets.
It is planned to modify MSG to allow more than two
different MSG's to coexist; this modification is not
as trivial as it was once believed to be.

o  The  current  process  interface  for  direct  connections 
blocks  internally,  so  that  the  process  does  not 
receive  control  from  an  alarm  until  all  direct 
connection  I/O  completes.  The  direct-connection 
interface  must  be  changed  to  be  non-blocking. 

o  Now  optimizes  the  number  of  idle  server  processes  maintained 
based  on  predicted  system  load. 

3.2.3.2 Foreman

The IBM 360 Foreman provides only a subset of the features
defined in the specification, as only features required to support the
DISPLAY tool are implemented. Specifically:

o The 360 Foreman supports encapsulated tools only; in
particular, there is no Foreman-tool interface.

o Encapsulation does not extend to the file system.
Therefore, NSW files can be fetched only
before the tool starts. Fetching is
accomplished by the Foreman interpreting a control stream
which it receives in the "filename-list" field of the
FM-BEGINTOOL command. A tool cannot dynamically select an
NSW file. Files cannot be delivered, as this
feature is not required by DISPLAY.

o The only tool-control command implemented is FM-BEGINTOOL;
FM-STARTTOOL and FM-STOPTOOL are not implemented. Any
non-zero value for Entvec is interpreted as 1, i.e. it
starts the tool at the beginning.

o There is no Local Name Dictionary (LND), and hence no
saving of LND's. FM-OK is not implemented. No LND cleanup
process is started automatically after a system crash.
FM-REBEGINTOOL is currently implemented as another name for
FM-BEGINTOOL. Otherwise, tool starting and stopping follow
the interim reliability scenarios.


3.2.3.3 File Package

The IBM 360 File Package implements substantially all of the
revised specification. A few features have either not been implemented
or have been incorrectly implemented. Specifically:

o All format effectors and record control tokens of IL
are implemented. However, the variable format effectors
HT, VT, LF, and FF, whose interpretations are defined
for each file by the GFD, are not fully tested with the
TENEX File Package.

o The IBM 360 File Package never arms itself for alarms, and
it never sends an alarm. If an error condition is found
during data transfer, the IBM 360 File Package will
immediately close the connection (rather than send an alarm,
as called for in the specifications). The File Package
has no mechanism for reporting the status of a
transfer operation.

o The full Error Descriptors are not supplied by the File
Package, due to PL/PCP restrictions. In particular:

-  The  list  of  debug  reports  is  always  empty. 

-  Only  one  error  can  be  reported,  the  first 
one  detected. 

- The values of the fault class and fault
number fields have not been properly
correlated with other File Package
implementations.

-  The  implementation  of  the  Smithsonian 
Astronomical  Date  Standard  is  untested. 

o A format for family copies of files which cannot be
described in IL has not been defined or implemented
for the IBM 360 family. Hence, all net transmissions,
regardless of family, use IL.

o A local data set can be accessed by the File Package only if
it exists within a directory in the NSW directory-group
(i.e., having the NSW charge number). Since there is
no mechanism to "connect" to a non-NSW directory, the
password parameter is ignored.

o Reblocking is not supported; a request to send an
IL-encoded file with a transmission block size smaller
than the IL blocksize in which it is recorded on disk
may fail. This is not expected to be a problem, since
File Package transmission block sizes are expected to be
established by gentleman's agreement and not varied.

o Binary IL encode/decode has now been tested and debugged
with the TENEX/TOPS20 File Package.

o Only byte size 8 is supported for data transfer.

3.2.3.4 Batch Job Package

The initial implementation of the CCN/360 Batch Job Package
is complete, and was released as a component of the candidate user
NSW system on November 16, 1978. This implementation completely
supports all BJP procedures specified in the revised Batch Job Package
specification included as Appendix B to this report. This implementation
has been extensively tested with the corresponding WMO version released
on the same date, and has no known outstanding deficiencies. There
are currently seven batch tools installed in NSW which may be run
by WMO-BJP. Only the FORTRAN tool has been extensively tested and
is known to run and produce good output. This testing deficiency
is largely due to the circumstance that the personnel responsible
for testing WMO-BJP are too unfamiliar with the other tools to create
test input for them.


3.2.4 MULTICS TBH Components

The MULTICS TBH remains the weakest part of NSW. The components
were implemented to comply only superficially with the specifications.
The TBH components have been analyzed to a procedure level,
and a preliminary conformance study has been written. Enough problems
have been fixed to justify the re-inclusion of MULTICS in the user
system, with an expanded tool kit.

3.2.4.1 MSG

MSG is a relatively stable MULTICS component. Its biggest
problem is its dependence on the unsupported TASKING software.

Unsupported items in the specification, as documented on October 3, 1978,
do not appear to compromise the usability of the MULTICS TBH software; many
remain unimplemented in other TBH systems.

Configuration control has been improved by creating a contact
socket table so that MULTICS MSG can contact remote NSW MSG's at the
correct socket numbers.

3.2.4.2 Foreman

The Foreman contains the greatest number of unimplemented
items, and is the source of most problems on the MULTICS TBH. The
implementation suffers from the fact that it was implemented to
support tools written specifically for NSW - i.e. tools that use
NSW tool primitives - and only later extended to support tool
encapsulation. In general, encapsulation can now be done, but the
quality of the encapsulation of each individual tool depends directly
on the amount of work put into each encapsulation.

Specific improvements in the current implementation are:

o Many small bugs eliminated.

o Tool termination works essentially as specified.

o Alarm processing has been improved.

o More tools are encapsulated more reliably.

3.2.4.3 File Package

The File Package, like MSG, is a fairly reliable component.
It conforms fairly closely to the specification, and supports file
encodement into Intermediate Language about as well as the other
TBH File Packages. Binary file transfer to non-MULTICS hosts is not
supported, but is not required by any currently installed tools.

3.2.5 Front End

The COMPASS NSW Front End is not much different functionally than
it was a year ago, since no major rewriting or addition of functions
has been undertaken. It is, however, both faster and sturdier than it
used to be:

Faster -- The FE program now handles (most of) its idle time by
interrupt mechanisms rather than timed waits; hence it no longer
consumes any CPU time between operations, and the CPU-time cost of waiting
periods during operations has been cut in half.

Sturdier  --  Anomalous  conditions,  especially  in  communications 
protocols,  are  detected  more  reliably  and  more  discriminating  responses 
are  made.  All  known  bugs  have  been  corrected. 

Several subtle accommodations have had to be made to the TOPS-20
operating system; but these have turned out to have no effect in the
TENEX operating system, so that identical object-code files run on the
two systems. Maintaining compatibility in this way means, of course,
that no advantage has yet been taken of several of the advanced
features offered by the newer TOPS-20.

Documentation:

The "external specs" of the FE are in reasonably good shape:

o The FE MSG Interface document, originally issued in
November 1977, has been corrected and updated, and
reissued in August 1978. It describes the format and
content of all MSG messages sent or received by the FE.

o The "user interface" document -- the NSW User's Reference
Manual -- has been extended and partially rewritten, and
was issued in November 1978 to describe the commands and
operations available to the user in the NSW version J. i
release.

Shortcomings: 

It is still possible for the FE process to "hang" if its
conversational partner -- Works Manager or Foreman -- accepts an MSG
message but then fails to reply. Without a moderately extensive
rewriting of the programs, we are faced with the following
choice in this circumstance:

(1) Abort the FE process, which leaves the user's Node
Records in a blocked state so that he cannot log in
again;

(2) Stop waiting for the reply and return to NSW command
level: this appears to work for non-responsive
Foremen, although the timeout has been set at jd
minutes; for Works Manager operations, this
alternative leads to an out-of-sync situation from
which the user cannot recover, if the belated reply
does eventually arrive.

(3) Wait indefinitely for the reply, which is what we do
now.

The  program  can  still  be  made  smaller  and  more  efficient, 
and  the  input-editing  facilities  need  to  be  completed. 

3.3 NSW Performance

During  the  reported  period  a  number  of  steps  were  taken  to  improve 
the  overall  performance  of  NSW.  Three  major  avenues  of  approach  were 
taken: 

1.  Memory  use  was  monitored. 

2.  TENEX  was  monitored  while  running  NSW  in  order  to  collect 
statistics  on  the  gross  use  by  NSW  components  of  TENEX  resources 
such  as  CPU  time,  JSYS  monitor  calls,  and  pager  faults. 

3.  Detailed  statistics  were  gathered  on  Works  Manager  CPU  usage. 

Memory use was monitored in two different ways. First, a memory
monitoring tool called PAM was developed, and included in many NSW
components. This tool, when activated, generates a map of exactly which
virtual memory pages were accessed at least once between any two
designated points in the execution of a program. This gives an accurate
picture of the total number of memory pages that would be required to
perform some NSW operation with no page faults. Because the result of
using PAM is a map of exactly which pages were accessed, it is also
possible to subdivide memory use into code and data accesses. From this
it is possible to predict what the memory requirements would be for an
NSW with a larger number of concurrent processes, all of which shared
code pages but each of which had its own local memory area.
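
The prediction described above amounts to counting shared code pages once and private data pages once per process. A minimal sketch follows, assuming the PAM map is available as a set of page numbers and that the set of code pages is known separately; the function names and this representation are illustrative assumptions, not PAM's actual output format.

```python
def split_pages(accessed, code_pages):
    """Split a map of accessed pages into code and data accesses."""
    code = accessed & code_pages
    data = accessed - code_pages
    return code, data


def predicted_pages(accessed, code_pages, n_processes):
    """Estimate pages needed when n processes share code but not data."""
    code, data = split_pages(accessed, code_pages)
    # Code pages are shared and counted once; each process has its own data.
    return len(code) + n_processes * len(data)
```

For instance, an operation that touches 3 code pages and 2 data pages would need 3 + 3 * 2 = 9 pages for three concurrent processes under this model, which assumes every process touches the same number of data pages.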

PAM  was  able  to  show  which  pages  were  accessed  at  least  once  during 
an  operation,  but  was  unable  to  show  how  many  times  each  page  was 
accessed.  Thus  the  figures  obtained  are  doubtless  larger  than  the  true 
Working  Set  for  NSW,  in  that  pages  are  counted  which  may  have  been 
accessed  only  once  or  twice  in  an  entire  operation.  In  order  to  get  a 
lower  bound  on  NSW  Working  Set  size,  NSW  was  run  on  a  metered  version  of 
TENEX  and  figures  were  obtained  on  the  Working  Set  size  that  TENEX 
allotted to each NSW process. These figures represent a lower bound on
the true Working Set, in that the figures also showed clearly that the
TENEX  configuration  on  which  the  tests  were  made  had  insufficient  memory 
to  run  NSW  without  excessive  paging.  Unfortunately  it  is  difficult  to 
extrapolate  from  these  figures  just  what  the  Working  Set  would  be  on  a 
TENEX with adequate memory.

During the reported period BBN made a number of tests of overall
system resource use by NSW. The results of these tests are described in
great detail in a report by Richard E. Schantz.

In addition to the Working Set estimates already discussed, these tests
showed that certain NSW processes were expending a great deal of time
making JSYS calls to the operating system. As a result several NSW
components, the File Package in particular, were altered to interact
with the monitor more efficiently. This resulted in a substantial
increase in File Package performance. These improvements are discussed
in more detail in section 3. 3 of this document.

These  measurements  of  overall  NSW  component  performance  clearly 
showed  that  the  Works  Manager  was  consuming  a  large  amount  of  CPU  time, 
but  gave  no  clue  as  to  exactly  where  the  time  was  being  spent.  To  get  a 
better picture of the problem a new performance tool for BCPL programs
was developed: PFSTAT. PFSTAT takes samples of wall clock time, CPU
time, and pager time at selected subroutine call and return points. The
result  is  a  detailed  picture  of  what  major  subroutines  were  called  and 
how  much  time  each  took  to  run.  When  PFSTAT  was  applied  to  the  Works 
Manager it showed quite clearly that the major problem was that the
Works  Manager  was  using  the  powerful  but  slow  Information  Retrieval 
System  to  store  all  of  its  tables,  including  those  tables  which  were 
accessed on every call. Accordingly, a new database management system
called the Works Manager Table Facility was developed to hold the most
active Works Manager tables, leaving the Information Retrieval System to
handle  only  the  NSW  File  Catalogue  for  which  it  was  originally  designed. 
As  a  result,  the  CPU  time  required  by  the  Works  Manager  was  reduced  by  a 
factor  of  **.  The  Works  Manager  Table  Facility  is  described  in  Appendix 
A  of  this  document. 
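
The sampling technique attributed to PFSTAT can be illustrated with a modern sketch. This is not PFSTAT itself: it assumes a Python decorator stands in for the selected call and return points, it samples only wall clock and CPU time (pager time has no portable counterpart here), and the subroutine name is hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated totals per subroutine: [calls, wall seconds, CPU seconds]
totals = defaultdict(lambda: [0, 0.0, 0.0])


def sampled(func):
    """Sample wall clock and CPU time at the call and return points of func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wall0, cpu0 = time.perf_counter(), time.process_time()
        try:
            return func(*args, **kwargs)
        finally:
            entry = totals[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - wall0
            entry[2] += time.process_time() - cpu0
    return wrapper


@sampled
def lookup_table(key):  # hypothetical "major subroutine"
    return sum(i * i for i in range(20000)) + key


for k in range(5):
    lookup_table(k)

for name, (calls, wall, cpu) in totals.items():
    print(f"{name}: calls={calls} wall={wall:.4f}s cpu={cpu:.4f}s")
```

A per-subroutine table of this kind is what makes a hot spot, such as a table lookup performed on every call, stand out from the overall CPU figure.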
4. Future Directions
4.1 Overview

As  noted  in  section  2.3,  we  are  now  in  phase  four  of  NSW 
development.  The  areas  of  greatest  concern  are  improving  reliability 
and performance. Substantial results in the area of performance
improvement should begin to be visible in the user NSW system by
October, 1978. Major effort on phase four should be over by mid 1979.

The  next  phase  of  NSW  development  should  be  creation  of  a 
production  NSW  system.  This  system  should  exhibit  the  basic 
functionality  already  developed,  as  well  as  the  robustness  and 
responsiveness now being implemented. In addition, NSW needs to have
the packaging, support, documentation, and capabilities of a finished
production system. Phase five of NSW development will concentrate on
providing  these  features.  We  expect  to  begin  phase  five  in  October, 
1978  by  beginning  the  expansion  of  the  RADC  TOPS-20  NSW  to  support 
the  activities  of  NSW  implementors.  The  first  specific  improvements 
scheduled  are  the  installation  and  testing  of  tools  needed  by  the 
implementors  and  the  addition  of  an  Arpanet  mail  facility.  More 
details about specific features can be found in section 4.2.

In  addition  to  program  improvement,  phase  five  will  include 
the establishment of the administrative structure needed to support
NSW  users,  manage  the  system  configuration,  operate  systems,  determine 
the  priority  of  bug  fixes  and  new  features,  prepare  and  distribute 
documentation,  etc. 

4.2  Components 

In  the  following  subsections  we  describe  the  tasks  to  be 
performed  to  complete  phase  four  of  NSW  development  and  move  into  phase 
five. 

4.2.1 Core System
4.2.1.1 Works Manager

Considerable  effort  must  still  be  devoted  to  completion  of 
phase  four  of  Works  Manager  development.  A  number  of  measurements  of 
Works Manager performance have been made and analyzed. Some
improvements have already been made, and a substantial improvement is
expected upon completion of the in-core Works Manager Table Facility
(see section 3.2.1.1). More performance optimization is possible,
and  more  effort  should  be  devoted  to  measurement,  analyses,  and 
implementations. Current efforts at modeling should also be continued.

In addition, certain portions of the full scale NSW
reliability  plan  should  be  implemented.  While  portions  of  that  plan 
treated  distributed  data  base  synchronization,  other  parts  dealt  with 
issues  of  process  and  network  failure  and  recovery.  These  other 
parts  should  be  implemented.  In  particular,  the  try-retry  mechanism 
and  timing  signals  are  needed.  Moreover,  a  facility  for  archiving 
and restoring NSW files and data bases should be designed and
implemented. 

All of these performance and reliability improvements could
be completed in 1979, thereby concluding phase four of NSW
development. Phase five, which is concerned with "productizing" NSW,
should begin for the Works Manager in October, 1978. While the Works
Manager  is  substantially  complete,  there  are  a  number  of  extensions 
which  should  be  made.  These  enhanced  capabilities  include: 

o  Arpanet  mail  interface  -  The  procedures  to  support  mail 
systems  (e.g.,  Hermes)  should  be  designed  and  implemented. 

o  Configuration  management  procedures  -  As  noted  in  section 
3.1,  manual  configuration  management  has  already  begun.  As 
more  NSW  development  work  is  done  using  NSW,  it  will  be 
possible  to  automate  configuration  management. 

o  Direct  file  access  -  Use  access  and  read  access:  Add  two 
new  kinds  of  NSW  file  access.  Use  access  means  that  a  user 
has undisputed rights to an NSW file. When he references
the  file  he  is  given  the  NSW  file  copy  -  not  a  private  copy. 
Any  alterations  he  makes  are  immediately  reflected  in  the 
file.  Read  access  allows  a  user  to  read  the  actual  NSW 
file copy - not a private copy. Thus it is suitable for
data  base  files. 

o  Tool  kits  -  When  a  user  runs  a  kit  of  several  tools  on  one 
host, the workspace should be left unchanged between each
tool. Thus, intermediate files can be passed from tool to
tool  without  delivery  to  NSW  file  space.  Both  of  these 
features  would  greatly  enhance  and  optimize  the  use  of 
local  tools. 

o Version numbers - Design and implement a file version
numbering facility. This facility must be rich enough to
support configuration management within NSW.

o History file - Implement the Works Manager routines to
record information on the History File. Design and implement
at least some interesting management/accounting routines
which access this file.

o Full file attributes - At present only the filename portion
of the complete NSW filename can be used for retrieval.
Also, the use of file attributes by tools is only permitted
for the Global File Descriptor. The implementation of
file attributes should be completed.

o  Tool  name  extensions  -  The  original  concept  of  complete  tool 
host  transparency  has  proven  unworkable.  Thus,  the  notion 
of  tool  name  should  be  extended  to  allow  (explicit  or 
implicit)  host  selection.  By  using  the  same  mechanism  as 
is  used  for  files,  the  entire  file  lock  system  can  also  be 
used  for  tools. 

o  System  status  commands  -  The  NSW  user  needs  commands 
to  interrogate  system  status  and  configuration: 

What  tools  are  available?  Which  resources  are  up? 

What  is  the  system  load? 

This  list  of  WM  extensions  by  no  means  exhausts  the  list  of  possible 
capabilities.  These  extensions  could  be  scheduled  for  implementation 
in  1979;  other  features  will  undoubtedly  be  suggested  as  NSW 
implementors  begin  to  use  NSW  for  their  own  development  efforts. 
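
The difference between the proposed use access and read access and the existing private-copy behavior can be sketched as follows. This is a minimal illustration only: the class names, the COPY mode label, and the in-memory representation are assumptions made for the example and are not part of the Works Manager design.

```python
from enum import Enum


class Access(Enum):
    COPY = "copy"  # existing behavior: the tool works on a private copy
    USE = "use"    # proposed: the tool works on the NSW file copy itself
    READ = "read"  # proposed: the tool reads the NSW file copy, no writes


class NSWFile:
    def __init__(self, contents):
        self.contents = bytearray(contents)


class Handle:
    def __init__(self, nsw_file, access):
        self.access = access
        # Only COPY access duplicates the data; USE and READ share it.
        self.data = (bytearray(nsw_file.contents)
                     if access is Access.COPY else nsw_file.contents)

    def write(self, offset, payload):
        if self.access is Access.READ:
            raise PermissionError("read access does not permit alteration")
        self.data[offset:offset + len(payload)] = payload

    def read(self):
        return bytes(self.data)


f = NSWFile(b"hello")
Handle(f, Access.COPY).write(0, b"J")  # changes the private copy only
Handle(f, Access.USE).write(0, b"c")   # reflected immediately in the file
print(bytes(f.contents))
```

Sharing the underlying storage is what makes read access suitable for large data base files, since no copy has to be delivered to the tool's workspace.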
4.2.1.2 Works Manager Operator

Very little needs to be done to complete phase four of Works
Manager Operator development. The mechanism used for batch job
submission has proven to be reliable in the face of Works Manager,
network, and batch host failure. Various detail improvements are
required, but these will not consume much effort. Moreover,
performance of the Works Manager Operator has not been a problem,
since it operates in background mode. The elapsed time for its
operation is only a minuscule fraction of total batch job execution
time. Some effort should be devoted to carefully measuring and
reducing CPU utilization because of the possible effect on
interactive NSW components, but this is not a high priority task.
Documentation of the Works Manager Operator should
be completed in the near future.

In  phase  five,  it  will  be  necessary  to  extend  the  functional 
capabilities of the Works Manager Operator. Such extensions include:

o Background file motion - The delays perceived by the user
when files must be transferred or reformatted can be
significantly reduced by performing such actions in
background mode.

o Job chaining - A desirable extension is to allow multiple
batch tools to be run in sequence. Such a sequence should
not be limited to just one batch host.

o Device I/O - A variant of background file motion is to have
WMO control input and output from devices local to a user.

o Support of small (or non-NSW) batch hosts - Some hosts may
be too small to support a Batch Job Package. Also, some
hosts may be desirable as batch hosts but may not have the
required NSW components (MSG, File Package). The Works
Manager Operator should be extended to use existing Arpanet
protocols (FTP, RJE) to submit batch jobs to such hosts.


4.2.2 TENEX/TOPS-20 TBH


4.2.2.1 MSG

Very little additional effort is required for TENEX/TOPS-20
MSG.  There  are  still  some  outstanding  MSG  design  issues: 

o  Details  of  MSG-MSG  authentication  -  The  general  mechanism  is 
as  specified  in  the  MSG  design  document  of  December,  1976. 
However, the details of the ARPANET protocol exchanges are
being  re-examined. 

o Maximum message size - The maximum message size is specified
to be 65536 bytes (2**16). No implementation will accept
messages that large. At present there is informal agreement
to  limit  message  size  to  at  most  2044a  bytes. 

o  Process  creation  -  This  issue  was  skirted  in  the  original 
specification.  However,  a  satisfactory  solution  must  be 
found  which  balances  the  dynamic  cost  of  process 
initialization  and  the  static  cost  of  maintaining  unused 
ready-to-run  processes. 

o Optimization techniques - Compound operations like "send
then receive" should be added, and some MSG code could
be included inside those processes run under MSG to reduce
context switching.

o  Reliability  techniques  -  Allow  for  multiple  hosts  to  be 
considered  as  recipients  of  generically  addressed 
messages,  so  that  the  system  can  function  better  in  the 
presence of "downed" hosts. The NSW Fault Logger is an
example  of  a  process  which  could  make  good  use  of  such 
a  feature. 

Once  these  design  issues  are  resolved,  TENEX/TOPS-20  MSG  must  be 
modified  to  incorporate  them.  In  addition,  recent  performance 
measurements  have  suggested  a  number  of  improvements  which  should  be 
implemented. 
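
The reliability technique listed above, delivery of a generically addressed message to any one of several candidate hosts, might look roughly like the following. The sketch assumes a hypothetical transport callable and exception type; neither is part of the MSG specification, and the host names are placeholders.

```python
class HostDown(Exception):
    """Raised by the (hypothetical) transport when a host cannot be reached."""


def send_generic(message, candidate_hosts, send):
    """Deliver message to the first reachable host of the addressed class.

    candidate_hosts: hosts believed to run a process of the generic class.
    send: callable(host, message) that raises HostDown on failure.
    Returns the host that accepted the message.
    """
    for host in candidate_hosts:
        try:
            send(host, message)
            return host
        except HostDown:
            continue  # try the next candidate instead of failing outright
    raise HostDown("no candidate host accepted the message")


# Stand-in transport for the example: HOST-A is down.
def fake_send(host, message):
    if host == "HOST-A":
        raise HostDown(host)


print(send_generic("fault report", ["HOST-A", "HOST-B"], fake_send))
```

A component such as a fault logger benefits because its clients need not know which host currently runs it.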


4.2.2.2 Foreman


Completion of phase four for the TENEX/TOPS-20 Foreman
involves two tasks. The first is the integration of the reliability
mechanisms described in the full scale NSW reliability plan - in
particular, the try-retry mechanism and timing signals. The second
task is improving Foreman performance with respect to CPU utilization
and paging requirements. A number of such improvements have been
suggested by the measurements and analysis already done. In
addition, documentation of the Foreman must be produced. This
documentation should be complete by February, 1979.

Although the TENEX/TOPS-20 Foreman substantially implements
the specification, there are a number of additional capabilities which
should be added. Some of these capabilities are implied by the
specification, and some are additional. These capabilities include:

o Permanent integration of the TOPS-20 mountable
structures interface

o Implementation of the solution to the saved LUl
workspace management problem.

o Coordinated Works Manager/Foreman protocol design
and implementation to have common data base items
reflect local resource management decisions

o Implementation of tool-specific encapsulated tool
interfaces to handle tool peculiarities and
improve performance

o Direct tool interface to NSW functions - i.e.,
non-encapsulated tool interface

o Design and implementation of a Foreman modified
for on-line tool debugging

o Design and implementation of Foreman extensions
for tool kits.

o Incorporation of some of the file package's functionality
in order to optimize file fetching and delivery operations.


4.2.2.3 File Package

Functionally, the TENEX/TOPS-20 File Package is essentially
complete, including implementation of IL encode/decode for binary
files.  Complete  performance  measurement  and  analysis  must  be  done. 
Preliminary  measurements  have  suggested  some  changes  which  should  halve 
CPU  utilization.  Additional  optimization  should  be  performed.  Some 
of  the  concepts  of  the  reliability  plan  could  also  be  extended  to  the 
File  Package.  The  other  major  task  to  be  completed  in  phase  four  is 
production  of  File  Package  documentation. 

In  phase  five,  the  capability  which  should  be  added  is 
optimization  of  cross-net  file  transfers.  The  baud  rate  of  such 
transfers  should  be  improved  and  automatic  restart  and  backup 
procedures in case of file transmission errors should be designed and
implemented. 
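
One way to picture the automatic restart and backup procedure is a transfer loop that keeps a checkpoint and resends only from the last block known to be delivered. The sketch below is an assumption-laden illustration: the block size, the restart limit, and the send_block callable are all hypothetical and do not describe the File Package protocol.

```python
def transfer(source, send_block, block_size=512, max_restarts=3):
    """Send source (bytes) block by block, restarting from the last good block.

    send_block(offset, block) is a hypothetical callable that raises IOError
    on a transmission error. Only the failed block is resent, not the file.
    Returns the number of restarts that were needed.
    """
    offset = 0    # checkpoint: everything before offset was delivered
    restarts = 0
    while offset < len(source):
        block = source[offset:offset + block_size]
        try:
            send_block(offset, block)
        except IOError:
            restarts += 1
            if restarts > max_restarts:
                raise
            continue  # back up to the checkpoint and resend
        offset += len(block)
    return restarts


received = bytearray()
failures = {1024}  # fail the first attempt at this offset


def flaky_send(offset, block):
    if offset in failures:
        failures.discard(offset)
        raise IOError("simulated transmission error")
    received.extend(block)


data = bytes(range(256)) * 8  # 2048 bytes
print(transfer(data, flaky_send), bytes(received) == data)
```

Restarting from a checkpoint rather than from the beginning matters most on slow cross-net links, where resending a whole file is expensive.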
4.2.3 IBM 360 TBH

All IBM 360 components need to be documented.

4.2.3.1 MSG

The IBM 360 MSG should have the deficiencies mentioned in
section 3.2.3.1 repaired. In addition performance should be measured
and improved. As the MSG design issues mentioned in section 4.2.2.1
are resolved, the IBM 360 MSG should be modified to reflect those
resolutions.


4.2.3.2 Foreman

The IBM 360 Foreman implements only a small subset of the
Foreman  specification.  To  the  extent  that  there  is  user  interest  in 
interactive  tools  on  IBM  360  hosts,  the  Foreman  should  be  extended  to 
implement  the  entire  specification. 

4.2.3.3 File Package

The IBM 360 File Package is essentially complete. A few minor
tasks remain to be done (see section 3.2.3.3), and these should be
completed.  Performance  measurement,  analysis,  and  improvement  should 
be  done.  Optimization  of  cross-net  file  transfers  should  be  done  in 
conjunction  with  the  TENEX/TOPS-20  File  Package. 

4.2.3.4 Batch Job Package

No  further  effort  on  this  component  seems  necessary. 


4.2.4 MULTICS TBH


As noted in section  the components of the MULTICS TBH
have been baselined. It is now apparent that considerable
effort  must  be  devoted  to  making  the  Foreman  implement  the 
specification.  MSG  and  the  File  Package  implementations  are  operating 
according  to  specification.  All  MULTICS  components  need  to  be 
documented.

4.2.5 Front End

Functionally, the TENEX/TOPS-20 Front End is essentially
complete.  It  has  also  been  completely  instrumented.  Measurements 
have  been  taken  and  analyzed.  While  some  level  of  ad  hoc  performance 
improvement  is  possible,  the  current  Front  End,  which  started  as  only 
a  debugging  tool,  must  be  completely  restructured  in  order  to  obtain 
a  satisfactory  level  of  performance.  The  Front  End  is  implemented  as 
a  multi-fork  process.  Almost  all  of  these  multiple  forks  can  be 
collapsed  into  a  single  fork.  This  will  decrease  both  CPU 
utilization  and  space  requirements.  Front  End  documentation  should 
also  be  completed. 

An additional path toward optimizing Front End performance is
to  split  the  Front  End  into  the  "switcher"  and  "parser"  functions.  A 
document  describing  the  functionality  of  the  split  was  produced 
in  July,  1978.  Since  this  split  is  orthogonal  to  the  current  fork 
structure,  the  reduction  of  the  number  of  forks  should  be  completed 
before  considering  the  implementation  of  the  split  Front  End. 


Parts  of  the  full  scale  NSW  reliability  plan  also  must  be 
implemented in the TENEX/TOPS-20 Front End - in particular, the
try/retry mechanism and timing signals. With the completion of these
performance and reliability tasks, phase four of Front End development
will  be  finished. 

There  are  several  Front  End  enhancements  which  should  be 
accomplished  as  part  of  phase  five  of  NSW  development.  These 
enhancements include:

o Optimization of local tool use - Some advantage should
be taken when the Front End and task are on the same
host. The split Front End is an approach to this
optimization.

o Macro facility - An NSW macro facility should be designed
and implemented. This would permit users to execute a
number of system/tool commands with a single command.
It should be able to execute either online or in
background mode.

o User profile - Use of the user profile to tailor
terminal handling should be designed and implemented.

o Access to text files - Currently the Front End can't access
NSW files - if the user wishes a file listed, an editor
or display tool must be invoked. The Front End should be
able to list the file itself, and additionally should
be able to take commands from a file to implement the
"Runfile" capability discussed later (see 4.3.3).


4.3 Functional Testing
4.3.1.  History 

COMPASS has been responsible since mid 1977 for functional
testing of NSW as outlined in "National Software Works Test Plan",
May 9, 1977, published by RADC/ISCP. Since that date, COMPASS has
run a manual functional test script on each version of the NSW system
which was a candidate for release as a new user system.

The  initial  version  of  this  script  was  restricted  to  the 
level of test specified in the RADC/ISCP Test Plan - to determine if NSW
components  functioned  as  specified  in  a  friendly  environment. 

Testing  was  limited  to  ensuring  that  all  components  in  the  test 
configuration  (including  remote  TBH's)  responded  correctly  to  correct 
user  input,  and  little  effort  was  made  to  test  the  system  in  the  face 
of  incorrect  input  or  errors  in  the  system  configuration.  NSW 
systems  tested  to  only  this  level  tended  to  behave  erratically. 
Therefore  the  functional  test  script  was  soon  extended  with  a  number 
of ad hoc tests of NSW's capacity to cope with user and configuration
errors.  This  is  the  level  of  testing  to  which  the  candidate  user 
system  released  on  November  16,  1978  was  subjected. 

COMPASS  has  been  mandated  to  develop  and  apply  a  more  carefully 
designed  and  rigorous  level  of  functional  testing  to  future  NSW  system 
releases.  The  remainder  of  this  section  describes  the  direction 
for  this  future  testing. 


4.3.2. Functional tests - content

We define "functional testing" as follows: to determine whether
a set of NSW components offered as a new system release meets the
following requirements:

(1) Can be correctly configured as an operational NSW system,
with all core and TBH components in a correct initial
state for operation.

(2) All functions specified to be present in the release perform
as expected for correct input, and all components in the
configuration function as specified for correct input.

(3) All error detection and reporting functions work as
expected for representative incorrect (user) input.
All components report and recover from user induced
errors as specified.

(4) The interim reliability scenarios perform as specified.

(5) The system recovers from configuration failures
(e.g. TBH crashes) to the extent specified and expected
for the release.

This testing includes complete tests for the delivery system,
for tools at each TBH - Foreman, File Package, Batch Job Package,
etc - but does not cover acceptance of any tools.

The test scripts will be structured into a series of
levels; the first level will test the least functionality and the
least complex core of the configuration. Each succeeding level will
test more functionality and/or more of the system configuration.


The general contents of the scripts will be as follows:

Level 0: Set up the complete system configuration, and
verify that all components are in a proper initial
executive and communications state.

Level 1: Test core system: all components local on
Works Manager host.

(a) Test all possible NSW command paths with correct
input in the following order:

i. LOGIN, MOVE, CHANGE password, LOGOUT.

ii. Project management tools: nodes, assign
rights, etc.

iii. ALTER command - SCOPE manipulation.

iv. File commands - NET, RENAME, COPY, DELETE,
SEMAPHORE. Local file transfers only.

v. Enter a batch job. (Processing deferred).

vi. Use a local interactive tool. Test
slewing, multiple tools, RESUME.

All recognition and completion features of the
Front End are to be tested.

(b)  Recapitulate  relevant  sections  of  (a),  with 
representative  errors  on  input.  The  error 
detection  and  reporting  facilities  of  the  local 
components  are  to  be  tested  in  the  following  order: 

i.  Front  End 

ii.  Works  Manager 

iii.  File  Package 

iv.  Foreman 

(c) Where appropriate in (a) and (b), the operation
of the Checkpointer is to be monitored, and message
and error logging is to be monitored.

Level 2: Test the distributed system: at least one instance
of each TBH family to be involved in the configuration.

(a) File transfer tests

i. Test family transfers, where available.
Currently limited to TENEX/TOPS-20 hosts
due to lack of multiple host resources.

ii. Test non-family transfers. At least
one text file transfer back and forth
between each family pair in the configuration,
and one round-robin transfer in a chain including
all families. Multiply translated files
must be identical under Intermediate Language
semantic specification. At least one
binary file transfer of each defined type.

(b) TBH test

i. Execute a batch job at each TBH. Monitor
performance of Works Manager Operator
and batch job processor for each job.

ii. Execute one interactive tool at each TBH.
Level of test identical to 1 (a) vi.

(c)  Recapitulate  (a)  and  (b)  introducing 
representative  errors  in  user  input. 

Level 3: Test interim reliability scenarios. Induce each
error condition covered by interim reliability
plan, and monitor all components involved for
correct behavior.

(a) Initial test will be for the core system only,
particularly to test correct behavior of
Works Manager.

(b) Test of Foreman capability for each TBH. Induce
only those failures which test the Foreman's
role in the reliability scenarios.

Level 4: Test system response to induced configuration
failures. Beyond checking response to "crashed"
TBH (NSW taken down), the content of this test
level is to be specified.
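
The non-family transfer requirement in Level 2 (one back-and-forth transfer per family pair, plus one round-robin chain through all families) can be enumerated mechanically. The following sketch is illustrative only; the function names are hypothetical, and plain equality stands in for comparison under the Intermediate Language semantic specification.

```python
from itertools import combinations


def transfer_plan(families):
    """Return (pairwise, chain) transfers for the non-family test.

    pairwise: one back-and-forth transfer for each pair of families.
    chain: a single round-robin route visiting every family and
    returning to the start, so the file is translated at every hop.
    """
    pairwise = [(a, b, a) for a, b in combinations(families, 2)]
    chain = list(families) + [families[0]]
    return pairwise, chain


def check_round_trip(original, returned):
    # Stand-in for comparison under IL semantics; plain equality here.
    return original == returned


pairs, chain = transfer_plan(["TENEX/TOPS-20", "IBM 360", "MULTICS"])
print(pairs)
print(chain)
```

With three families this yields three pairwise round trips and one four-hop chain, which is the minimum the script calls for.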


4.3.3. Functional tests - methodology


It will be necessary to automate these tests as much as
possible both to avoid expending excessive professional staff time
on them, and to make the tests reliably repeatable. COMPASS
has investigated three classes of tools which can assist this
automation effort:


1. Run file facilities external to NSW

TENEX RUNFIL
TOPS20 TAKE
TELNET take.input.from.file


2. Run file facilities within NSW

Front End RUNFILE command


3. Production (syntactic rule) systems

RITA


1. Run file facilities external to NSW


The tools listed are all basically similar. Each has the
advantage of being familiar, tested and straightforward. All
lack a sufficiently sophisticated means of synchronizing
their input to the processes they control with what is in fact
happening. The synchrony problem limits these tools to situations
in which no slewing between TELNET connections is done. This
excludes any testing of NSW tools, and makes changing TELNET conversational
partners to monitor configuration status changes unreliable.


2. Run file facilities within NSW


Provision of a RUNFILE command has one outstanding advantage:
the Front End is always aware of the identity of the user's conversational
partner - NSW command processor, HELP call, or tool - and is thus
perfectly placed to control the synchronization of command file with the
actual behavior of NSW. An additional advantage is that we can add
desired features to this facility as needed, but must accept
the others as they are. The disadvantages are that this facility
has to be designed, implemented and tested; and that it can only
automate the user input portions of the test scripts.
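
The synchronization advantage described above can be made concrete with a small sketch. It assumes, purely for illustration, that each line of a command file is tagged with the conversational partner it is meant for and that the Front End can report the current partner; the toy Front End and the command names USE and QUIT are hypothetical and are not the RUNFILE design.

```python
def run_command_file(lines, current_partner, send):
    """Feed a command file to NSW, checking the partner before each line.

    lines: (expected_partner, text) pairs.
    current_partner: callable returning the partner the Front End is
        currently connected to ("NSW", "HELP", or a tool name).
    send: callable(text) delivering one line of input.
    Returns the index of the first line that could not be sent in
    sync, or None if the whole file was delivered.
    """
    for index, (expected, text) in enumerate(lines):
        if current_partner() != expected:
            return index  # synchronization failure: stop, do not guess
        send(text)
    return None


class ToyFrontEnd:
    """Stand-in that switches partner when a tool is entered or left."""

    def __init__(self):
        self.partner = "NSW"
        self.log = []

    def send(self, text):
        self.log.append((self.partner, text))
        if self.partner == "NSW" and text.startswith("USE "):
            self.partner = text[4:]  # a tool becomes the partner
        elif text == "QUIT":
            self.partner = "NSW"


fe = ToyFrontEnd()
script = [("NSW", "USE EDITOR"), ("EDITOR", "insert line"),
          ("EDITOR", "QUIT"), ("NSW", "LOGOUT")]
print(run_command_file(script, lambda: fe.partner, fe.send))
```

An external run file tool has no equivalent of current_partner, which is why it cannot safely drive sessions that slew between connections.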


3. Production systems - RITA


RITA has the advantage that it can handle both user input
and configuration management with a sufficiently rich rule set.
Our studies indicate that the development of such a rule set would
be a demanding job. A more significant problem is that TENEX
RITA is likely to consume excessive CPU resources to run
a rule set as complex as that needed by NSW.


Proposed Methodology


We propose that a mixture of manual testing and the use of
two of the tools described above be used to run the functional tests.
The mix would be as follows:


Use RITA to set up and initialize the NSW configuration
for each level test, and confirm that the initialization
is correct.


Use NSW RUNFILE to automate all user input to test Levels
1, 2, and 3. The RUNFILE facility will have some or all of
the following features:

(i). Ability to interrupt

(ii). A synchronization scheme

(iii). HELP from attached user if synchronization
failure occurs

(iv). A PAUSE feature

(v). A macro feature - string and/or file name
binding at run time.

(vi). A "learning" feature which will allow
the Front End to do most of the work of
turning a manual script into a command file
(speculative).


Use manual scripts for much of level 3 testing and most of
level 4 testing. Probe system status and monitor component
operation as required during Level 1, 2, and 3 testing.
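
Two of the proposed RUNFILE features, run-time binding of strings or file names and a PAUSE feature, can be sketched together. The "$name" placeholder syntax, the literal PAUSE line, and the function name are assumptions made for this example; the report does not specify a command file syntax.

```python
from string import Template


def expand(script_lines, bindings, on_pause=lambda: None):
    """Expand a command file, binding $names at run time.

    A line consisting of the word PAUSE calls on_pause() instead of
    producing a command. Unbound names raise KeyError.
    """
    commands = []
    for line in script_lines:
        if line.strip() == "PAUSE":
            on_pause()
            continue
        commands.append(Template(line).substitute(bindings))
    return commands


script_text = ["COPY $src $dst", "PAUSE", "DELETE $src"]
print(expand(script_text, {"src": "test.level1", "dst": "test.copy"}))
```

Binding at run time lets one script serve several test levels or hosts by changing only the table of names.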
Miscellaneous

There are additional tasks to be undertaken which do not fall
within the scope of a single component. One major effort, the creation
of an administrative structure for NSW, was mentioned in section 4.1.
In this section we list some additional efforts:

o Help facility - An online help mechanism for NSW users
should be designed and implemented. This should probably
look like a tool within NSW.

o Distributed system debugger - It should be possible to
debug a distributed system like NSW from within NSW.
An appropriate debugger should be designed and
implemented. This will almost certainly require
changes to the Works Manager and Foreman components,
and possibly to MSG also.

o  Fault  logger  -  An  NSW  wide  component  for  logging  all 
error  messages  should  be  designed  and  implemented. 

o  Automated  testing  -  The  functional  and  stress/regression 
testing  of  NSW  test  and  user  systems  should  be 
automated. 

o  Management tools - Tools for manipulating the project
tree are available in rudimentary form. These should
be improved, and additional tools for accessing the
History file, report generation, etc. designed and
implemented.

o  Operators  tools  -  A  tool  kit  for  the  user  system 
operator  to  at  least  partially  automate  data  base 
cleanup,  system  starting,  etc.  should  be  designed 
and  implemented. 

o  Tool installation - Install, test, and document more NSW
tools. In particular, install a tool kit adequate for
NSW implementors.

Works Manager Table Facility


As of November 1977, NSW had progressed to the point where it was
sufficiently robust and complete to allow serious use. However, it was
very slow, even with only one or two users logged in. More than two
users was clearly out of the question.

It was felt that part of the problem was due to the difficulty of
implementing an interprocess protocol such as MSG on top of the standard
TENEX monitor. In addition, it was known that the physical memory on
the host machine was inadequate to support a minimal NSW working set.
It was also clear, however, that a great deal of the problem was simply
that it took a lot of CPU processing to perform any user command.
In particular, the Works Manager required a lot of CPU time.

Some relatively simple changes were made in 1977 that speeded
up the Works Manager by a factor of two. As a result, Works Manager CPU
usage per procedure call was now comparable to that of other NSW
components. Unlike the File Package, the Foreman, and WHO, however, the
Works Manager participates in almost every interaction with the user.
Thus while the Works Manager was not always the worst CPU time burner
per call, it was certainly the worst per user session due to the large
number of calls made on it.

To give some perspective on the problem, figure 1 shows the amount
of CPU time consumed by the 1977 Works Manager during the indicated
procedure calls.


CPU TIME REQUIRED TO PERFORM WORKS MANAGER PROCEDURE CALLS
MARCH 1978

Procedure      CPU time (seconds)

LOGIN          1.2
RUNTOOL        2.0
PUT            2.8

Fig. 1

By March of 1978 it became possible for us to consider making a
concerted attack on the problem of Works Manager performance. Not only
was there manpower available to work on the problem but also PFSTAT, our
performance measuring tool for BCPL programs, had been developed to the
point where it was adequate to the task of pinpointing the sources of
the problem.


We made a number of tests of various Works Manager procedures, and
a pattern quickly became apparent: the Works Manager was spending most
of its time making calls on the Information Retrieval System. This
confirmed what we had already suspected, since earlier tests on the
Information Retrieval System had shown that table access calls were so
expensive by themselves that there would be little CPU time left in most
Works Manager routines to assign to any other cause.


The Information Retrieval System itself will need some substantial
optimization sometime in the future. The primary problem in March of
1978, however, was simply that the Information Retrieval System was
originally designed to support only the NSW File Catalogue, and the File
Catalogue has substantially different characteristics from other Works
Manager tables. Figure 2 contrasts the NSW File Catalogue with the
other Works Manager Tables.


CHARACTERISTICS OF NSW FILE CATALOGUE

1.  Very large number of items to store.

2.  Infrequent access to any one item.

3.  Retrieval by keyword.


CHARACTERISTICS OF OTHER WORKS MANAGER TABLES

1.  Small number of items -- most items could fit in process virtual
memory for a medium sized (100 node) NSW.

2.  Frequent access to many items.

3.  Retrieval not only by keyword but also by parameter value.


The NSW File Catalogue contains a potentially large number of
items. Generally, no individual item in this database will be accessed
very often. Item retrieval is by keyword: "Give me all files whose
names start with COKFASS.SLUIZtK and which also contain WALUO". Because
people are relatively inventive, there is expected to be a large number
of keywords.


However, the other Works Manager tables fit quite a different
pattern. They contain a relatively small number of items. In fact, for
an NSW of moderate size, say 100 user nodes, all these tables could fit
into part of a process virtual memory. Most of the items in these
tables will be accessed frequently. Finally, the name structure of
these tables is different. Most items have simple names of fixed
structure. Furthermore, item name elements must at times be used as
parameter values instead of as keywords. For example, the Works Manager
may wish to retrieve all Deleted File Entries that have timestamps which
are at least 20 minutes old.

The time schedule was met in the following manner: First, the
design was deliberately made quite standard. The algorithms and data
structures used were ones which were known to be simple, proven, and
flexible. NSW overall may be a research project, but the Works Manager
Table Facility certainly was not. Second, once the design was complete
the implementation was done to cost. Where time did not permit an
optimal implementation of some part of the design, simpler data
structures and algorithms were employed which could easily be replaced
later. For example:

1.  Exclusive locks were used for concurrency control. Now, an
exclusive lock provides excessive protection; for example, it
prevents two separate processes from reading the same element
simultaneously. However, the way the database is structured
there should be few collisions.

2.  Singly-threaded  lists  and  linear  scans  were  used  instead  of  more 
complex  structures  and  faster  scans.  As  the  figures  will  show 
later,  the  Works  Manager  Table  Facility  that  resulted  was  still 
adequately  fast. 

3.  Fixed allocations were used. For example, whenever an online
database is set up the maximum number of table entry slots
needed must be preallocated. This wastes space in the database.
Code could be added later which could dynamically reconfigure a
table header whenever the need arose, allowing preallocated
table header space to be made much smaller.

Finally, the time schedule was met by deferring implementation of
parts of the design not needed immediately, in particular overflow
storage and checkpointing. These parts were implemented later, in
version 2.

In evaluating these implementation shortcuts, we must remember that
the goal is a fast Works Manager, not necessarily a fast Table Facility.
For example, we could create a version 3 with dynamic reconfiguration of
table entries in order to save table space. However, we could use the
same time instead to make changes to other parts of the Works Manager.
These other changes might well save a great deal more table space for
about the same coding effort.

Version 1 of the Works Manager Table Facility was first
used in a Works Manager in August of 1978. At the highest logical level,
it implements a table structure similar to the one already supported by
the Information Retrieval System. The major differences between the two
systems are confined to lower logical levels.

The database consists of a number of Works Manager Tables.
Each table consists of a set of table entries, for example Active
User entries or Node entries. A table entry is the basic unit of
transaction in the Works Manager Table Facility. An entry is composed
of:

1.  An entry name consisting of a list of parameter values.

2.  A body, which at this level of detail is nothing more than a
block of arbitrary data.

3.  A set of external locks. These locks are used by Works Manager
instances to coordinate their use of table entries. They are
not used by the Works Manager Table Facility itself.

Parameters in an entry name can be character strings, integers, or
timestamps. In the Information Retrieval System, all parameters were
character strings. The shape of an entry name, that is, the number of
parameters and the type of each, varies from table to table but is the
same for all entries in a given table. In the Information Retrieval
System, on the other hand, different entries in the same table could
have different numbers of parameters. In both systems, however, the
entry names are the sole means of choosing one entry over another in a
retrieval.
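
The fixed entry-name shape and retrieval by parameter value can be
sketched as follows. This is only an illustration in Python, not the
BCPL implementation; the names Table, Entry, put, and retrieve are
hypothetical, and the linear scan is an assumption chosen for simplicity.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Sequence


@dataclass
class Entry:
    name: tuple   # parameter values, one per position in the table's shape
    body: bytes   # arbitrary block of data, opaque at this level
    locks: set = field(default_factory=set)  # external locks, unused by the facility


class Table:
    def __init__(self, shape: Sequence[type]):
        # The shape (number and type of name parameters) is fixed per table.
        self.shape = tuple(shape)
        self.entries: List[Entry] = []

    def put(self, name: tuple, body: bytes) -> None:
        if len(name) != len(self.shape) or not all(
            isinstance(v, t) for v, t in zip(name, self.shape)
        ):
            raise TypeError("entry name does not match table shape")
        self.entries.append(Entry(name, body))

    def retrieve(self, *tests: Optional[Callable[[Any], bool]]) -> List[Entry]:
        # One optional predicate per name parameter; None matches anything.
        return [
            e for e in self.entries
            if all(t is None or t(v) for t, v in zip(tests, e.name))
        ]


# Hypothetical Deleted File table: (file name, deletion timestamp in seconds).
deleted = Table([str, int])
now = 10_000
deleted.put(("A.TXT", now - 30 * 60), b"...")
deleted.put(("B.TXT", now - 5 * 60), b"...")

# "All Deleted File Entries at least 20 minutes old": a test on a value,
# which a pure keyword match could not express.
old = deleted.retrieve(None, lambda ts: now - ts >= 20 * 60)
print([e.name[0] for e in old])  # prints ['A.TXT']
```

Treating the timestamp as a typed parameter rather than a keyword string
is what makes the age comparison possible.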

The online database is a single block of data, kept in a single
TENEX file. All processes that access this database map a portion
of their virtual memory down onto this file. Thus each process sees
the database as a part of its own memory.

This single block of memory is divided into variable-sized
blocks. Each block is kept track of by a two word header which is
separate from the block. There is a singly-threaded list of free blocks
arranged in order of increasing address in memory. When the database
is first created, this list contains one single, large block. As memory
is used and then later released, the list will grow. To satisfy a request
for memory, the list is scanned to find the smallest block that is
large enough. Generally, the requisite memory is then split off from the
block unless the block is only slightly larger to begin with. The address
of the block is returned, not the address of the header that defines it.

Blocks which are in use are threaded onto another list. This list
is necessary in order to find the block header when the block is later
freed, as higher-level routines know only the address of the block, not
the address of the header.
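
The allocation scheme just described can be sketched in a few lines of
Python. This is a hedged illustration only: the class and method names
are hypothetical, the SPLIT_SLACK threshold is an assumed value for "only
slightly larger", a dictionary stands in for the threaded in-use list,
and free blocks are not coalesced because the text does not mention it.

```python
from dataclasses import dataclass
from typing import Dict, List

SPLIT_SLACK = 2  # assumed: do not split if the leftover would be this small


@dataclass
class Header:
    addr: int  # start address of the block this header describes
    size: int  # length of the block in words


class Heap:
    def __init__(self, size: int):
        self.free: List[Header] = [Header(0, size)]  # increasing address order
        self.used: Dict[int, Header] = {}            # block address -> header

    def alloc(self, n: int) -> int:
        # Best fit: scan the whole free list for the smallest adequate block.
        fits = [h for h in self.free if h.size >= n]
        if not fits:
            raise MemoryError("no free block large enough")
        best = min(fits, key=lambda h: h.size)
        if best.size - n > SPLIT_SLACK:
            # Split: hand out the front, leave the remainder on the free list.
            block = Header(best.addr, n)
            best.addr += n
            best.size -= n
        else:
            # Only slightly larger: hand out the whole block.
            self.free.remove(best)
            block = best
        self.used[block.addr] = block
        return block.addr  # callers see the block address, never the header

    def release(self, addr: int) -> None:
        # The in-use record is what recovers the header from a bare address.
        header = self.used.pop(addr)
        self.free.append(header)
        self.free.sort(key=lambda h: h.addr)


heap = Heap(100)
a = heap.alloc(10)
b = heap.alloc(20)
heap.release(a)
c = heap.alloc(9)  # reuses the freed 10-word block whole, so c == a
```

Because released blocks are simply put back on the list, the free list
grows with use, as the text notes.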

One of the more significant differences between the Works Manager
Table Facility and the Information Retrieval System is that in the Works
Manager Table Facility the information which defines a table is
centralized, whereas in the Information Retrieval System this
information is distributed. In the online database each table is
represented by a data structure called a table header. The header
consists of a fixed part which contains several items, including a
definition of entry name shapes and the starting address and length of a
block of entry slots. All slots are the same size. Some slots are
marked as not in use; the rest define table entries. A slot in use
contains the following information:

1.  The  value  for  each  parameter  in  the  entry  name. 

2.  The  entry  external  locks. 

3.  A  pointer  to  the  body. 

The original design was based on the idea that a Works Manager
process would be given the address of an entry body, providing Works
Manager processes with the most direct access possible to table entries.
However, implementers of Works Manager procedures felt that this was too
dangerous. All access is now through copies, just as it was in the
Information Retrieval System. This requires an extra block transfer
operation for most table accesses. This consumes on the order of an
extra millisecond of CPU time, and is not really a substantial
contribution to system overhead.



Version 1 of the Works Manager Table Facility was first used in a
Works Manager sometime in July of 1978. This initial version put the
Active User Entries and the User Identification Entries online and left
all other tables in the Information Retrieval System databases. These
two tables are referenced in almost every Works Manager Procedure call.
Figure 3 shows clearly that even this small change produced a
substantial improvement. This more-or-less served as an acceptance test
of the overall concept, and soon thereafter all other Works Manager
tables except the NSW File Catalogue were put online. The final
figures, given in the right hand column, were just about what we had
hoped to see, given our PFSTAT runs of 1977. As you can see,
routines which do not access the NSW File System take about one third of
a second, while routines which do access the file system take about 1
second. We have used PFSTAT to analyze the difference between GET and
PUT which, on the surface at least, should take almost exactly the same
amount of time. We have found a minor problem in the interface between
the Information Retrieval System and the Works Manager. When this
problem is corrected, we expect that GET and PUT will both take about
800 milliseconds.



Version 1 of the Works Manager Table Facility was released
for system testing late in 1978. Version 2, released for testing in
October of 1979, implemented those features of the design which were
deferred, namely checkpointing and overflow handling. Specifically,
the new features in version 2 are:

1.  Overflow handling -- When space is needed online,
least-recently-used table bodies are dumped into an Information
Retrieval System database. The item number of the offline body
rather than the item name is stored in the entry slot, to avoid
an expensive keyword search when the item is later accessed.

2.  Checkpointing -- A checkpoint lock was added to ensure that
while the Checkpointer is copying the database, no process can
be writing into that database.

3.  Dynamic allocation of memory block headers -- This reduces space
wastage and removes an arbitrary restriction on the degree to
which memory can be broken into separate blocks.

4.  Various improvements to increase robustness and ease of use, for
example, database internal version numbers and support for
PFSTAT.
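
The overflow handling in item 1 can be pictured with the following
Python sketch. It is an assumption-laden illustration, not the actual
design: OfflineStore merely stands in for an Information Retrieval System
database, its add and fetch methods are hypothetical, capacity is counted
in bodies rather than words, and stale offline copies are never reclaimed.

```python
from collections import OrderedDict


class OfflineStore:
    """Stand-in for the offline database (hypothetical interface)."""

    def __init__(self):
        self.items = []

    def add(self, body):
        self.items.append(body)
        return len(self.items) - 1  # item number: a direct index, no keyword search

    def fetch(self, item_no):
        return self.items[item_no]


class OnlineTable:
    def __init__(self, capacity, offline):
        self.capacity = capacity     # assumed unit: number of bodies held online
        self.offline = offline
        self.online = OrderedDict()  # entry name -> body, least recently used first
        self.slots = {}              # entry name -> item number of the offline body

    def put(self, name, body):
        self.slots.pop(name, None)   # any offline copy is now out of date
        self.online[name] = body
        self.online.move_to_end(name)
        while len(self.online) > self.capacity:
            victim, victim_body = self.online.popitem(last=False)
            # Store the item number, not the name, in the slot.
            self.slots[victim] = self.offline.add(victim_body)

    def get(self, name):
        if name in self.online:
            self.online.move_to_end(name)
            return self.online[name]
        body = self.offline.fetch(self.slots[name])  # direct access by item number
        self.put(name, body)                         # bring the body back online
        return body


table = OnlineTable(capacity=2, offline=OfflineStore())
table.put("a", b"A")
table.put("b", b"B")
table.get("a")        # "a" is now more recently used than "b"
table.put("c", b"C")  # "b" is dumped offline to make room
```

Keeping the item number in the slot means a later access goes straight
to the offline body instead of repeating a name lookup.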

These performance improvements were sufficient to solve the
immediate problem. Eventually, however, another optimization pass will
be needed. It would be premature to say much about what optimizations
should be performed next, as we haven't had a chance to think all that
deeply about the problem yet. The first step would probably be to use
PFSTAT to generate quantitative models of present system performance.
From this we could predict the overall effect of any specific change
before we actually made that change.

To give some feel for the current status of the Works Manager and
to show how we might go about making another optimization pass, let us
examine figure 4, which shows some PFSTAT output from a call on the
Works Manager Procedure WM-LOGIN.

no.  type   lvl  caller PC     called PC      realtm  runtm  dlreal  delrun  dlpage

 45  call   2    WMMAIN 11466  MSPRMS 325027   56478    905       0       0       0
 46  retn   2    WMMAIN 11466  MSPRMS 325027  178312    918  121834      13      42
 55  merge  2    WMMAIN 11652  WMLOGS  37527  179729    962    1417      44      56
 56  call   3    WMLOGS 37577  DSTHCP 316067  179730    963       1       1       0
 57  retn   3    WMLOGS 37577  DSTHCP 316067  179746    979      16      16       0
 58  call   3    WMLOGS 37655  OPNUTS 251731  179954    983     208       4       6
 59  retn   3    WMLOGS 37655  OPNUTS 251731  181176   1015    1222      32      80
 60  call   3    WMLOGS 40236  PASHZN 342055  181207   1018      31       3       1
 61  retn   3    WMLOGS 40236  PASHZN 342055  181209   1020       2       2       0
 62  call   3    WMLOGS 40426  DRTUTS 274556  181349   1023     140       3      17
 63  retn   3    WMLOGS 40426  DRTUTS 274556  181608   1029     259       6       6
 64  call   3    WMLOGS 40457  DRTUTS 274556  181610   1031       2       2       0
 65  retn   3    WMLOGS 40457  DRTUTS 274556  181616   1037       6       6       0
 66  call   3    WMLOGS 40504  DRTUTS 274556  181618   1039       2       2       0
 67  retn   3    WMLOGS 40504  DRTUTS 274556  181624   1045       6       6       0
 68  call   3    WMLOGS 40531  DRTUTS 274556  181626   1047       2       2       0
 69  retn   3    WMLOGS 40531  DRTUTS 274556  181630   1051       4       4       0
 70  call   3    WMLOGS 40556  DRTUTS 274556  181632   1053       2       2       0
 71  retn   3    WMLOGS 40556  DRTUTS 274556  181639   1060       7       7       0
 72  call   3    WMLOGS 40603  DRTUTS 274556  181641   1062       2       2       0
 73  retn   3    WMLOGS 40603  DRTUTS 274556  181647   1068       6       6       0
 74  call   3    WMLOGS 40630  DRTUTS 274556  181649   1070       2       2       0
 75  retn   3    WMLOGS 40630  DRTUTS 274556  181652   1073       3       3       0
 76  call   3    WMLOGS 40655  DRTUTS 274556  181654   1075       2       2       0
 77  retn   3    WMLOGS 40655  DRTUTS 274556  181751   1159      97      84       0
 78  call   3    WMLOGS 40702  DRTUTS 274556  181753   1161       2       2       0
 79  retn   3    WMLOGS 40702  DRTUTS 274556  181756   1164       3       3       0
 80  call   3    WMLOGS 40726  DRTUTS 274057  181758   1166       2       2       0
 81  retn   3    WMLOGS 40726  DRTUTS 274057  181760   1168       2       2       0
 82  call   3    WMLOGS 40751  DTBSOS 177621  181871   1171     111       3       5
 83  retn   3    WMLOGS 40751  DTBSOS 177621  182122   1176     251       5      10
 84  call   3    WMLOGS 40766  WMEUTS 254640  182124   1178       2       2       0
 85  retn   3    WMLOGS 40766  WMEUTS 254640  182332   1191     208      13      10
 86  call   3    WMLOGS 41016  DRTUTS 276304  182351   1193      19       2       1
 87  retn   3    WMLOGS 41016  DRTUTS 276304  182353   1195       2       2       0
 88  call   3    WMLOGS 41045  DRTUTS 274057  182355   1197       2       2       0
 89  retn   3    WMLOGS 41045  DRTUTS 274057  182357   1199       2       2       0
 90  call   3    WMLOGS 41071  OPNUTS 253330  182405   1202      48       3       3
 91  retn   3    WMLOGS 41071  OPNUTS 253330  182629   1340     224     138       8
 92  call   3    WMLOGS 41110  WMUTIL  73251  182675   1344      46       4       3
 93  retn   3    WMLOGS 41110  WMUTIL  73251  182761   1366      86      22      10
 94  call   3    WMLOGS 41117  DCPMCH 342625  182763   1368       2       2       0
 95  retn   3    WMLOGS 41117  DCPMCH 342625  182849   1373      86       5       3
 96  call   3    WMLOGS 41153  WMMAIN  11677  182851   1375       2       2       0
 97  retn   3    WMLOGS 41153  WMMAIN  11677  185436   1434    2585      59      35
 98  call   3    WMLOGS 41167  DRTUTS 276304  185541   1439     105       5      14
 99  retn   3    WMLOGS 41167  DRTUTS 276304  185543   1441       2       2       0
100  retn   2    WMMAIN 11652  WMLOGS  37527  185544   1442       1       1       0
113  merge  2    WMMAIN 11466  MSPRMS 325027  186156   1470     612      28      30

Fig. 4

A detailed description of how PFSTAT works is beyond the scope of
this document, but briefly PFSTAT manipulates the BCPL control stack in
order to take samples on selected subroutine calls and returns. It will
sample all calls down to some specified level in the runtime call tree;
below this level PFSTAT is disabled, so there is no overhead on
low-level calls. The global sampling level can be manipulated locally
by specifying that certain calls are to be magnified or pruned. In this
particular case, the global level is 2 and the call from the Works
Manager main program to the login subroutine has been flagged
for magnification. The listing shows real (or wall clock) time, runtime
(or CPU time), and paging time, all in milliseconds. The full listing
for this run is quite a bit larger. We have used PFSTAT's formatted dump
facility to print only a portion of that listing. In addition, we have
merged uninteresting sequences of nodes. This merger is done by the
formatter, after the samples are taken.

For example, consider samples 76 and 77 of figure 4. Sample
76 was taken when the Works Manager called a subroutine. The call was
from address 40655, which is in module WMLOGS. In fact, we know that it is
in the login routine. We could use the BCPL debugger, BDDT, to print
out the line of BCPL code that made the call. The subroutine being
called is at address 274556 in module DRTUTS. This subroutine turns
out to be a utility routine which copies table elements. Again, we
could examine the subroutine with BDDT, given the address. Sample
76 was taken 181.654 seconds of real time after PFSTAT was enabled and
the sampling began. By this time 1.075 seconds of CPU time had been
consumed. A real time of 181.654 seconds is 2 milliseconds of real time
after sample 75 was taken. During this interval, 2 milliseconds of CPU
time and 0 milliseconds of paging time were consumed. In other words,
this process had complete use of the CPU during the interval and no
pages had to be read in.


Sample 77 was taken when this subroutine returned. The real time
was 181.751 seconds, or 97 milliseconds later than the real time at sample
76. By now, 1.159 seconds of CPU time had been charged to this process,
which was 84 milliseconds more than had been charged at sample 76. No
additional paging time was charged, indicating that the entire
subroutine was already in real memory.

We have found PFSTAT to give highly repeatable results, which are
not noticeably affected by system load, amount of real system memory, or
type of scheduler. The only real problem is that the sampling procedure
itself takes, on the average, just under 2 milliseconds. This average
appears to be stable. Unfortunately, there is a jitter of 1 to 2
milliseconds between the time the samples are taken and the time TENEX
increments its internal tables. PFSTAT should probably be modified to
subtract sampling time from the figures. For the moment, however, the
reader must mentally subtract 2 milliseconds from each incremental
runtime. For merged samples, however, a multiple of 2 milliseconds must
be subtracted. For example, sample 55 is a merger of 9 samples, so 18
milliseconds must be subtracted instead of 2.
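
The mental subtraction described above amounts to the following small
calculation, shown here in Python as an illustration. The function name
is hypothetical; the 2 millisecond overhead and the two example samples
are taken from the text and listing.

```python
SAMPLE_OVERHEAD_MS = 2  # average cost of taking one sample, per the text


def corrected_cpu_ms(delrun_ms, merged_samples=1):
    """Subtract sampling overhead from an incremental runtime figure."""
    return delrun_ms - SAMPLE_OVERHEAD_MS * merged_samples


print(corrected_cpu_ms(84))     # sample 77, a single sample: prints 82
print(corrected_cpu_ms(44, 9))  # sample 55, a merger of 9 samples: prints 26
```

The 1 to 2 millisecond jitter is not modelled, so corrected values for
very short calls should be read as approximate.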

Figure 4 shows a login call on the Works Manager. This is the
latest Works Manager, which incorporates version 2 of the Works Manager
Table Facility. The results we saw previously were for version 1. As
sample 46 demonstrates, the Works Manager spends 11 milliseconds of CPU
time waiting for and then receiving the procedure call message. All the
times in figure 4 are for this fork only. The 11 milliseconds is a bit
puzzling, as the only process which does anything significant during
this call is the MSG fork. In any case, the Works Manager then spends
44 minus 18 or 26 milliseconds verifying that this is a valid procedure
call message and deciding to call the Login procedure in module WMLOGS
(sample 55). The Login routine then spends 14 milliseconds converting
the message into internal representation and simultaneously verifying
that the message has the correct form for a login request (sample 57).

Login next retrieves the node entry corresponding to the project and
node name given in the message, consuming 30 milliseconds in the process
(sample 59). It appears that a lot of this time is due to the very
simple and somewhat inefficient entry name matching algorithm in the
Works Manager Table Facility. Next, the Login routine builds an Active
User Entry by copying elements from the node entry. The only really
expensive operation here is 82 milliseconds to copy the list of tool
rights (sample 77). At sample 85, Login takes 11 milliseconds to create
the new Active User Entry. Login then puts the updated node entry back
into the online database, consuming an alarming 136 milliseconds in the
process (sample 91). Login then spends 20 milliseconds checking the
User Identification Entry for messages (sample 93). Finally, Login
takes 57 milliseconds to format a reply message and send it (sample 97).
After Login returns to the Works Manager main program, the main program
almost immediately starts to wait for another procedure call message
(sample 113).

In summary, Login took a total of 429 milliseconds, or 80
milliseconds more than it did when it was tested in August of 1978 with
version 1 of the Works Manager Table Facility. A comparison with the
detailed timings for the version 1 test shows that in the new timings
the only significant differences were that copying the list of tool
rights took 37 more milliseconds and putting the node back took 43 more
milliseconds. The extra 37 milliseconds for the copy is easy to
explain; the node logged into when testing version 2 had about twice as
many tool rights as the one used when testing version 1. The 43
milliseconds for updating the node was much more disturbing, so another
test run was made, this time requesting PFSTAT to magnify the call to
put the updated node away. Figure 5 shows only that call.

no.  type   lvl  caller PC      called PC      realtm  runtm  dlreal  delrun  dlpage

 90  call   3    WMLOGS  41071  OPNUTS 253330  103796   1110       0       0       0
 91  call   4    OPNUTS 253356  OLRUTS 261002  103798   1112       2       2       0
 92  retn   4    OPNUTS 253356  OLRUTS 261002  103848   1117      50       5       0
 93  call   4    OPNUTS 253466  MSKUTS 263533  103861   1120      13       3       1
 94  retn   4    OPNUTS 253466  MSKUTS 263533  103863   1122       2       2       0
 95  call   4    OPNUTS 253567  WMTUTS 270420  103881   1125      18       3       1
 96  retn   4    OPNUTS 253567  WMTUTS 270420  104261   1249     380     124       4
 97  call   4    OPNUTS 253636  ENTUTS 256557  104264   1251       3       2       0
 98  retn   4    OPNUTS 253636  ENTUTS 256557  104278   1266      14      15       0
 99  call   4    OPNUTS 253650  OLRUTS 261215  104280   1268       2       2       0
100  retn   4    OPNUTS 253650  OLRUTS 261215  104281   1269       1       1       0
101  retn   3    WMLOGS  41071  OPNUTS 253330  104281   1269       0       0       0

Fig. 5

The subroutine calls shown in figure 5 are all in WMTPKG, a set of
routines which lie between the Works Manager top level routines on one
side and the Works Manager Table Facility and the Information Retrieval
System on the other side. The actual call to put the node entry away is
made by the subroutine called at sample 97. This call takes only 13
milliseconds. The expensive call is the one at sample 95, which
consumes 122 milliseconds. This is a call to a garbage collection
routine, and happens to be quite unnecessary. In other words, we have
found a performance bug in the Login routine. (This bug has since been
fixed.)

A brief explanation is in order of what this garbage collection is all
about. As far as the Works Manager Table Facility is concerned, a table
entry is just a block of arbitrary numbers. Actually, an entry body is
a data structure called a Dynamic Relocatable Table. The use of the
word "table" here is unfortunate, and comes in part from the fact that
the Works Manager is implemented in layers. In any case, a dynamic
relocatable table, here a node entry, will grow in length if nonscalar
elements in it are replaced. The only way at present to retrieve the
space is to do a garbage collection, which consists of copying, element by
element, the entire data structure. This is what is taking 122
milliseconds. Here the garbage collect was totally useless, since the
only element changed was a scalar.
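
The following Python sketch illustrates why replacing a nonscalar
element leaves dead space and why a copying collection is wasted work
when only a scalar changed. It is a toy model under stated assumptions:
the class layout, the method names, and the dirty flag used to skip a
useless collection are all hypothetical, and the text does not say how
the Login bug was actually fixed.

```python
class RelocatableTable:
    """Toy model: scalars are stored in place, nonscalars in an append-only heap."""

    def __init__(self):
        self.scalars = {}
        self.refs = {}           # key -> (offset, length) into the heap
        self.heap = bytearray()
        self.dirty = False       # set once a nonscalar replacement leaves dead space

    def set_scalar(self, key, value):
        self.scalars[key] = value  # replaced in place; the table does not grow

    def set_bytes(self, key, data):
        if key in self.refs:
            self.dirty = True      # the old bytes become unreachable garbage
        self.refs[key] = (len(self.heap), len(data))
        self.heap += data

    def get_bytes(self, key):
        offset, length = self.refs[key]
        return bytes(self.heap[offset:offset + length])

    def collect(self):
        """Copy every live element into a fresh heap; skip if nothing to reclaim."""
        if not self.dirty:
            return False
        new_heap = bytearray()
        for key, (offset, length) in list(self.refs.items()):
            self.refs[key] = (len(new_heap), length)
            new_heap += self.heap[offset:offset + length]
        self.heap = new_heap
        self.dirty = False
        return True


node = RelocatableTable()
node.set_bytes("tool rights", b"abc")
node.set_scalar("login count", 1)
skipped = node.collect()                  # only a scalar changed: nothing to do
node.set_bytes("tool rights", b"abcdef")  # heap now holds 3 dead + 6 live bytes
collected = node.collect()                # copies the live element, reclaims 3 bytes
```

Note that the cost of a collection is proportional to the size of all
live elements, which is why a longer list of tool rights makes it slower.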

This also explains 37 of the 43 extra milliseconds that this
garbage collection took with version 2 compared to version 1. A garbage
collect is a copy of all elements, including the list of tool rights. As
we saw, the longer list in the new node took 37 extra milliseconds to
copy.




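SUMMARY OF CPU TIME CONSUMED IN WM-LOGIN


A total of 429 milliseconds was expended in all.

77 milliseconds, or 18% of the total, was spent retrieving tables.

111 milliseconds, or 26% of the total, was spent sending, receiving,
and decoding messages.

241 milliseconds, or 56% of the total, was spent manipulating table
entries. Of this, 122 milliseconds was due to an unnecessary garbage
collection. Of the remaining 119, at least 82 milliseconds was due to
the list of tool rights.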
Figure 5 summarizes what we now know about the Works Manager Login
call:

A total of 429 milliseconds of CPU time was consumed in all. Only
77 milliseconds, or 18% of the total, was spent retrieving tables. Thus
few of the improvements we suggested earlier for the Works Manager Table
Facility would have any noticeable effect, at least for a Login call.
The only potential problem is that we might get into trouble retrieving
nodes if there got to be a lot of them, so we should probably do
something about entry name searches before NSW grows too much larger.

Again, only 111 milliseconds, or 26% of the total, was spent sending,
receiving, and decoding MSG messages. This is a relief, as some of the
utility routines which handle MSG messages are large and look
threatening. A whopping 241 milliseconds, or 56% of the total, was
spent manipulating table entries. About half of this time is due to a
bug, an unnecessary garbage collection. Furthermore, 69% of the remaining
119 milliseconds was spent copying the list of tool rights.

Thus, if we got rid of the garbage collect and found some way to
get rid of the time spent copying the list of tool rights, then Login
would take about 225 milliseconds -- just over half as much time as it
does now.
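
The arithmetic behind this estimate can be checked with a short sketch.
The three component times and the 122 millisecond garbage collection are
the figures given in the text; the total is taken as their sum, the
tool-rights share of the remaining entry time is taken as 69%, and the
variable names and the rounding to whole milliseconds are assumptions
made for illustration.

```python
# CPU time components of the Login call, in milliseconds (from the text).
RETRIEVING_TABLES_MS = 77
MSG_HANDLING_MS = 111
TABLE_ENTRY_MS = 241
GARBAGE_COLLECT_MS = 122   # the unnecessary garbage collection
TOOL_RIGHTS_SHARE = 0.69   # share of the remaining table-entry time

# The total is the sum of the three components.
total_ms = RETRIEVING_TABLES_MS + MSG_HANDLING_MS + TABLE_ENTRY_MS


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# Table-entry time left once the garbage collection is removed.
remaining_ms = TABLE_ENTRY_MS - GARBAGE_COLLECT_MS

# Time attributed to copying the list of tool rights (rounded, assumed).
tool_rights_ms = round(TOOL_RIGHTS_SHARE * remaining_ms)

# Projected Login time with both costs eliminated.
projected_ms = total_ms - GARBAGE_COLLECT_MS - tool_rights_ms

if __name__ == "__main__":
    print(total_ms, remaining_ms, tool_rights_ms, projected_ms)
```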

This is only an example of how we would go about planning for
further reductions in Works Manager runtime. Before we actually do any
reducing, we should perform the same type of analysis on all major Works
Manager procedure calls. We should then use those results, plus the
results of LiBK's system tests and Manni Chandi's higher level model, to
plan which changes will be most cost-effective in terms of overall NSW
performance.
Batch Job Package - External Specification

This  document  is  written  in  the  style  of  Appendix  1  of  "The 
Foreman:  Providing  the  Program  Execution  Environment  for  the  National 
Software  Works"  by  R.  Schantz  and  R.  Millstein,  which  is  used  herein 
as  a  reference.  It  describes  functions  of  the  Batch  Job  Package 
(which  may  be  invoked  by  WMO)  and  a  function  of  WMO  (which  may 
optionally  be  invoked  by  BJP).  All  invocation  messages  are 
generically  addressed;  requested  replies  on  the  other  hand  are 
specifically  addressed  to  the  invoking  process. 

The Batch Job Package (BJP) on a Batch Tool Bearing host
(BTBH) cooperates with Works Manager Operators (WMO's) to control the
execution of NSW batch jobs. Once an NSW user has submitted a job, it is
the responsibility of a WMO and BJP to execute the job and to produce
status reports as required.

WMO serves a role analogous to that of the Foreman in that
WMO keeps the Local Name Dictionary (LND) for the job and supervises
required file prestaging and delivery operations. BJP includes all
functions (exclusive of file transfer and translation) required at BTBH
to accomplish batch job execution. Its conversational partners are the
WMO's through which jobs are submitted.

BJP's data base is concerned with the management of all NSW
jobs in progress at BTBH. Associated with each job is the generic process
address of the WMO which submitted it; that WMO is therefore BJP's
conversational partner with respect to that job throughout the
duration of the job. The following discussion relates to messages
for individual jobs and therefore is cast in terms of communication
between a single WMO and BJP. It is assumed that the process address
of the submitting WMO is recorded in BJP's data base, and is used for
addressing messages from BJP. Should any WMO disappear (because of
an error), an as yet unspecified restart sequence must be executed.

Job  naming  considerations 

Each  WMO  maintains  a  queue  of  up  to  256  jobs  in  progress. 

When  an  NSW  user  submits  a  job,  the  WM  to  which  the  user  is  assigned 
contacts  a  WMO  with  the  job  request.  WMO  assigns  the  job  to  an  unused 
location  in  the  queue,  an  assignment  which  remains  intact  throughout 
the  stages  of  job  execution. 

Every time WMO is "cold-started" all entries in its queue
are marked as free. Cold starts are distinguished
by maintaining a WMO cycle number, which is initially 1 and is incremented
(except that the successor to 16383 is 1) each time such a cold
start occurs. NSW's name for a job is a triple of indices:
the WMO host number, that WMO's cycle number, and the position within
that cycle's job queue. Every time WMO contacts BJP regarding an NSW job,
its NSW name is included in the message.
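
The naming scheme above can be sketched as follows. This is a minimal
illustration, not the WMO implementation: the class and function names
are hypothetical, queue positions are assumed to run from 1 to 256, and
"first free slot" is an assumed allocation policy.

```python
from dataclasses import dataclass

MAX_CYCLE = 16383   # the successor to 16383 is 1
QUEUE_SIZE = 256    # each WMO queue holds up to 256 jobs


def next_cycle(cycle: int) -> int:
    """Advance the WMO cycle number, wrapping from 16383 back to 1."""
    return 1 if cycle >= MAX_CYCLE else cycle + 1


@dataclass(frozen=True)
class NswJobName:
    """NSW's name for a job: a triple of indices."""
    host: int       # WMO host number
    cycle: int      # that WMO's cycle number
    position: int   # position within that cycle's job queue (assumed 1-based)


class WmoQueue:
    """Hypothetical stand-in for one WMO's job queue."""

    def __init__(self, host: int, cycle: int = 1):
        self.host = host
        self.cycle = cycle
        self.slots = [None] * QUEUE_SIZE

    def cold_start(self) -> None:
        """Mark every entry free and move on to the next cycle number."""
        self.slots = [None] * QUEUE_SIZE
        self.cycle = next_cycle(self.cycle)

    def submit(self, job) -> NswJobName:
        """Assign the job to the first unused location (assumed policy)."""
        for index, slot in enumerate(self.slots):
            if slot is None:
                self.slots[index] = job
                return NswJobName(self.host, self.cycle, index + 1)
        raise RuntimeError("job queue is full")
```

Because the cycle number changes on every cold start, a name issued
before the restart cannot collide with one issued afterwards, even
though the same queue position is reused.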
BJP may wish to create local names for NSW jobs. This optional
character string is returned by BJP when a job is accepted. WMO will
retain this name in its job queue and supply it along with the
NSW name in messages relating to the job.

If BJP can supply cycle and number, local name is optional;
if it can supply local name, cycle and number are optional.

Complete information is preferred for error checking.

BJP must respond to a small number of messages from WMO,
and initiate a message of its own - as follows:

. WMO's initial contact with BJP is to request allocation
  of a workspace and assignment of local names for job
  output files. Arguments include batch tool name,
  cost estimate (in machine dependent units), priority,
  and a list of size estimates for job output files.

  Response includes the workspace name (a character
  string), a status indicating whether the job was accepted
  or rejected, and a list of local names assigned to
  output files.

. Once file pre-staging is complete, WMO will request BJP
  to submit a given local file for batch execution.

  The response is job status.

. BJP will notify WMO when a job is done. This message
  includes time and charge information.

. WMO will request BJP to delete a job.

  BJP's response confirms the deletion and reports
  any abnormal conditions.

. WMO may inquire as to the status of a job, supplying
  a Boolean which is true if the job is to be
  continued or false if it is to be cancelled. BJP's
  response includes a user-oriented character string as well
  as the job's status.


All job execution sequences begin with an allocation request:


BJP-ALLOCATEJOB (tool-id-list, accounting-list,
                 tool-dependent-parameter-list)

    -> tool-id-list, status-list, workspace-descriptor

where the parametric data are:

tool-id-list: LIST(cycle-no, tool-instance-id, local-name)

    cycle-no:
        index, with 0 denoting "unknown."
    tool-instance-id:
        integer, with 0 denoting "unknown."
    local-name:
        charstr; if length is 0 then is "unknown."

accounting-list:
    see reference pages A1-3 and A1-4.

tool-dependent-parameter-list: LIST(n-charstrs)

status-list: LIST(status-code, qcan-proceed, status-report)

    status-code:
        index   =0->  not found
                =1->  allocated
                =2->  scheduled
                =3->  running
                =4->  halted
                =5->  deleted

    qcan-proceed:
        boolean =true->   can proceed
                =false->  cancellation requested

    status-report:
        charstr; if length = 0 then is null report

workspace-descriptor:
    see reference page A1-2
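
As an illustration of the status-list layout, the following sketch
models it as a small record. The status code values follow the list
above; the Python names (StatusCode, StatusList, decode_status_list) are
hypothetical and the wire encoding of the message is not modeled.

```python
from dataclasses import dataclass
from enum import IntEnum


class StatusCode(IntEnum):
    """Values of status-code, in the order listed in the specification."""
    NOT_FOUND = 0
    ALLOCATED = 1
    SCHEDULED = 2
    RUNNING = 3
    HALTED = 4
    DELETED = 5


@dataclass
class StatusList:
    """LIST(status-code, qcan-proceed, status-report)."""
    status_code: StatusCode
    qcan_proceed: bool        # True: can proceed; False: cancellation requested
    status_report: str = ""   # zero length means a null report

    def has_report(self) -> bool:
        """True when the status report is not the null report."""
        return len(self.status_report) > 0


def decode_status_list(code: int, qcan_proceed: bool, report: str) -> StatusList:
    """Build a StatusList, rejecting status codes outside 0..5."""
    return StatusList(StatusCode(code), qcan_proceed, report)
```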
Parameter usage

Tool-id-list:

Every message between WMO and BJP includes tool-id-list. This
applies to both invocations and replies. When WMO sends a
tool-id-list it always includes cycle-no and tool-instance-id, and
local-name once WMO learns of it. Most messages from BJP to WMO are
specific replies. The returned tool-id-list may include cycle-no and
tool-instance-id if desired; WMO will perform consistency checks if
they are returned. Should a local name be returned, then it will be
recorded and supplied to BJP in all subsequent messages from WMO to
BJP. This name must be unique for that BJP. Local-names cannot
be reused by a BJP. Different BJPs may use the same name.

Should BJP return the local-name once it has been recorded,
consistency checks will be made on it, as well.
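
The consistency checks described above might look like the following
sketch. It assumes the "unknown" conventions of tool-id-list (0 for the
two numeric fields, a zero-length string for local-name); the names
ToolId and reconcile are hypothetical, and raising an exception is only
one possible way of reporting a mismatch.

```python
from typing import NamedTuple


class ToolId(NamedTuple):
    """LIST(cycle-no, tool-instance-id, local-name)."""
    cycle_no: int           # 0 denotes "unknown"
    tool_instance_id: int   # 0 denotes "unknown"
    local_name: str         # "" denotes "unknown"


def reconcile(recorded: ToolId, returned: ToolId) -> ToolId:
    """Check a tool-id-list returned by BJP against WMO's record.

    Fields returned as "unknown" are skipped. A local name seen for the
    first time is recorded; one seen again must match the recorded name.
    Raises ValueError on any mismatch.
    """
    if returned.cycle_no and returned.cycle_no != recorded.cycle_no:
        raise ValueError("cycle-no mismatch")
    if (returned.tool_instance_id
            and returned.tool_instance_id != recorded.tool_instance_id):
        raise ValueError("tool-instance-id mismatch")
    if returned.local_name:
        if not recorded.local_name:
            return recorded._replace(local_name=returned.local_name)
        if returned.local_name != recorded.local_name:
            raise ValueError("local-name mismatch")
    return recorded
```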

All other parameters are as described in the reference
document, except status-list, which is straightforward.

All subsequent messages from BJP to WMO report on the state of a
particular job. BJP's response requirements seem to be as follows:
given a message about a job, inform WMO of the new state of the job.
This yields a simple response algorithm for BJP, as BJP learns about
job completion via polling or a message from BTBH's operating system,
and the required response is also a status report to WMO.

BJP-QUERY (tool-id-list, qproceed)

    -> tool-id-list, status-list, accounting-list

where the additional parameter is

qproceed:
    boolean =true->   can proceed
            =false->  cancellation requested


Note that this simplifies WMO processing, as Status is the
only variable of interest to it - the remainder of the message data
is simply recorded. If Local-name is supplied, then Cycle-no
and Job-no are optional, and vice versa.


BJP-ENDJOB (tool-id-list)

    -> tool-id-list, status-list, accounting-list

Note that this is not the same as BJP-QUERY(-,F), since it is used by
WMO after file delivery and recording of final job charges is complete.


Functions implemented within WMO


WMO-JOBHALTED (tool-id-list, status-list, accounting-list)


This is the only function in WMO invoked by BJP. It
requests WMO to initiate job delivery operations. It is an optional
feature; alternatively, WMO can issue BJP-QUERY periodically
until it finds the status code corresponding to job halted.

If implemented, then either local-name or tool-instance-id must
be supplied (unless BJP limits itself to a single NSW job at any
given time).
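
The polling alternative can be sketched as below. The `query` argument
is a hypothetical callable standing in for one BJP-QUERY exchange; it is
assumed to return a mapping with a "status_code" entry, and the value 4
is taken as the status code for a halted job. Handling of the "not
found" and "deleted" codes is omitted for brevity.

```python
import time

HALTED = 4  # status-code for "halted" in status-list


def wait_until_halted(query, tool_id, interval_s=1.0, max_polls=None,
                      sleep=time.sleep):
    """Issue queries until the job reports HALTED, then return the reply.

    query(tool_id, qproceed) stands for one BJP-QUERY invocation;
    qproceed is always True here, meaning the job may proceed.
    max_polls=None means poll without limit.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        reply = query(tool_id, True)
        if reply["status_code"] == HALTED:
            return reply
        polls += 1
        sleep(interval_s)
    raise TimeoutError("job did not halt within the polling limit")
```

A BJP that implements WMO-JOBHALTED spares WMO this loop; the trade-off
is one more message type for BJP to support.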


Optional functions which can be implemented within any Batch
Job Package


BJP-STARTJOB (tool-id-list, workspace-descriptor, filename)

    -> tool-id-list, status-list, accounting-list

where the additional parameter is filename:
see reference page

This function is optional. The named file is taken as the
job control statements for the given job. Alternatively, if BJP's
file package can initiate job execution when a file of a given kind
is created (such as (CF = <jobname>, L'C = 1!() names in CLC oCJFfc.
systems), BJP-STARTJOB need not be supported.

WMO will be driven by a table taken from the tool descriptor
to sequence the procedures which actually get invoked. This
allows supporting a variety of batch hosts; e.g. - hosts where
filespace must be reserved and those where it doesn't, and
hosts which do WMO-JOBHALTED and those that don't.
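
A table-driven sequencer of this kind might be sketched as follows. The
two descriptor flags (supports_startjob, sends_jobhalted) and the step
labels are hypothetical; the actual layout of the tool descriptor table
is not given in this specification, and the step order is inferred from
the message descriptions above.

```python
def plan_job_steps(tool_descriptor):
    """Return the ordered steps WMO would run for one batch job.

    tool_descriptor is a mapping of per-host capability flags
    (hypothetical names); missing flags take the defaults shown.
    """
    # Every job execution sequence begins with an allocation request.
    steps = ["BJP-ALLOCATEJOB", "prestage input files"]

    # BJP-STARTJOB is optional: some hosts start a job when its file
    # is created, so no explicit start request is needed.
    if tool_descriptor.get("supports_startjob", True):
        steps.append("BJP-STARTJOB")

    # Either BJP tells WMO the job has halted, or WMO polls for it.
    if tool_descriptor.get("sends_jobhalted", False):
        steps.append("await WMO-JOBHALTED")
    else:
        steps.append("poll BJP-QUERY until halted")

    # BJP-ENDJOB follows file delivery and final charge recording.
    steps += ["deliver output files", "BJP-ENDJOB"]
    return steps
```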
Appendix C
Transportability

We define transportability as follows: an NSW component/utility
is transportable if it can be copied from one host to another and be
immediately executed with no patching or modification beyond naming
the transported executable file to suit the new system's naming
conventions. This definition covers transport between both similar
(TENEX - TENEX, TOPS20 - TOPS20) and dissimilar (TENEX-TOPS20-TENEX)
systems.


All NSW components and utilities maintained by MCA had to be
modified to achieve transportability. We identified two classes of
problems to be solved: (1). Differences between TENEX and TOPS20 at
the interface to the operating system, i.e. different behavior and/or
naming of JSYSes; (2). Tailoring - configuration-specific data
built into components that had to be patched for each NSW system on
each NSW host, e.g. the NSW filespace name compiled into the File
Package. The first problem we solved by encapsulating most JSYS
calls in three JSYS routine packages with system insensitive call
interfaces; we solved the second by providing a Global Tailoring File
which captures configuration information for each NSW system on each
host.


JSYS encapsulation

We encapsulated JSYSes by dividing the JSYSes used throughout
our NSW components into three groups, and providing three corresponding
packages of BCPL routines to access the JSYSes indirectly. The groups
are:


(1). File handling, day/time, and error handling JSYSes.

(2). PMAP and related file mapping JSYSes, packaged to do
     page-oriented reading and writing to files.

(3). Utilities to access the ARPANET system tables
     maintained on both TENEX and TOPS20 hosts.

The call format for all the routines in these packages is
consistent and uniform; success or failure of each call is signaled in
a uniform way, and a system-produced error message is returned whenever
available. Obtaining the latter typically requires two further JSYS
calls (GETER and ERSTR) which formerly had to be placed in line; thus
much of our code which calls the encapsulations has been reduced in size,
and is more readable. When required, checking for TENEX vs. TOPS20
is done at run time. The contents and characteristics of the
three encapsulation packages are:
1. Common JSYS package (JSYSUT in BCPPKG)

. JSYSes which have different names and arguments on TENEX
  and TOPS20, encapsulated so that arguments produce
  results identical to execution on TENEX:

    STDIR/RCDIR  (directory string to directory number)
    CNDIR/ACCES  (connect to directory)
    GTAD         (get time and day. TOPS20 internal
                 time converted to TENEX)
    ODTIM        (output time. TENEX internal converted
                 to TOPS20 if required)

. JSYSes with identical names, but requiring differing
  defaults or argument formatting to execute identically on
  TENEX and TOPS20:

    GTJFN        (Get JFN. TOPS20 specific code to
                 handle structure references)
    PMAP         (Map fork/file page. To handle
                 differences in unmapping)
    DELDF        (expunge directory. Different
                 argument sequence)

. JSYSes with identical behavior on both systems.
  Encapsulation provides consistent defaulting and
  error handling:

    OPENF        (Open file)
    CLOSF        (Close file)
    CLZFF        (Close fork's files)
    RLJFN        (Release JFN)
    DELF         (Delete file)
    SFPTR        (Set file pointer)
    JFNS         (JFN to string)
    RNAMF        (Rename file)
    SIZEF        (Size of file)
    BIN, BOUT    (Byte in, out)
    SIN, SOUT    (String in, out)

2. PMAP I/O package (PKPUTS in BCPPKG)

. A group of five routines which use the PMAP JSYS to do
  page-at-a-time I/O between a file and core memory.
  Encapsulates the difference between TENEX and TOPS20 PMAP,
  and isolates the user from the subtleties of using PMAP
  to create file pages. The operations supported are:

    Getting a file page into core.
    Putting a core page into a file.
    Locating the next file page.
    Getting and modifying some parts
    of a File Descriptor Block.


3. Host tables package (HOSTUT)

. A group of routines which support the following activities:

    One-time read of host tables into core.
    Return host (ARPANET) number given name.
    Return host name and nicknames given number.
    Return host operating system type given number.

These packages were thoroughly tested with several straightforward
test drivers; all NSW components were then laundered to remove direct
in-line JSYS calls. All utilities which did not require the Global
Tailoring File facility then became immediately transportable.
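
The uniform call convention of the encapsulation packages can be
illustrated with a short sketch. The original packages are BCPL routines
wrapping JSYS calls; Python is used here only as a stand-in, and the
names Result and encapsulate, the use of OSError for a failed system
call, and the "TENEX" string test are all assumptions.

```python
from typing import Callable, NamedTuple, Optional


class Result(NamedTuple):
    """Uniform outcome of an encapsulated call."""
    ok: bool                      # success or failure, signaled the same way
    value: object = None          # result of the call when ok is True
    error: Optional[str] = None   # system-produced message when available


def encapsulate(tenex_impl: Callable, tops20_impl: Callable,
                system_type: Callable[[], str]) -> Callable[..., Result]:
    """Wrap two system-specific routines behind one call interface.

    The check for TENEX vs. TOPS20 is made at run time, on each call,
    so the same wrapper can be copied to either kind of host.
    """
    def call(*args) -> Result:
        impl = tenex_impl if system_type() == "TENEX" else tops20_impl
        try:
            return Result(True, impl(*args))
        except OSError as exc:
            # Stands in for fetching the system error text in one place
            # instead of in line at every call site.
            return Result(False, error=str(exc))
    return call
```

Callers then test a single `ok` field rather than repeating
system-specific error handling after every call, which is the reduction
in code size the text describes.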
Global Tailoring File Facility

This facility consists of the following elements:

(1) The global tailoring file, a fixed format ASCII
    file encoded so as to be untypeable. This file must
    exist in the LOGIN directory for each NSW system on
    each NSW host; it contains the following entries:

        NSW filespace name
        WMO Job Queue File name
        Checkpoint directory name
        External process MSG call timeout value
        Component logging flag
        WMO sleep interval (between each queue in job queue file)

(2) A utility (GLBTAL) which can create or list the contents
    of this file, and replace individual entries.

(3) A package of utilities (GLBUTS in NSUPXG) which allow
    components to read entries from the file.

(4) A set of descriptors for the file entries compiled into
    GLBTAL and all accessors of the file.

Use of this facility requires the following discipline: the
Global Tailoring File must exist in each TENEX/TOPS20 NSW LOGIN
directory, and its format must be compatible with the compiled-in
descriptors for all accessors in that directory. Reformatting the
file requires re-compilation, reloading and re-distribution of all accessors.
This has not proved burdensome to date, as the file changes format
rarely.
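
The dependence of every accessor on compiled-in descriptors can be seen
in a small sketch. The field names and widths below are invented for
illustration, since the actual file layout is not given here, and the
encoding that makes the real file untypeable is not modeled.

```python
# Hypothetical descriptors: (entry name, field width in characters).
# In the facility described above, equivalent descriptors are compiled
# into GLBTAL and into every accessor of the file.
DESCRIPTORS = [
    ("nsw_filespace_name", 20),
    ("wmo_job_queue_file", 20),
    ("checkpoint_directory", 20),
    ("msg_timeout", 6),
    ("logging_flag", 1),
    ("wmo_sleep_interval", 6),
]

RECORD_LENGTH = sum(width for _, width in DESCRIPTORS)


def write_entries(entries):
    """Lay the entries out as one fixed-format record, padded with blanks."""
    return "".join(str(entries[name]).ljust(width)[:width]
                   for name, width in DESCRIPTORS)


def read_entries(text):
    """Split a fixed-format record back into named entries.

    A record whose length disagrees with the descriptors is rejected:
    this is the incompatibility that forces accessors to be rebuilt
    whenever the file is reformatted.
    """
    if len(text) != RECORD_LENGTH:
        raise ValueError("file format does not match compiled-in descriptors")
    entries, offset = {}, 0
    for name, width in DESCRIPTORS:
        entries[name] = text[offset:offset + width].rstrip()
        offset += width
    return entries
```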


In addition to providing configuration information (directory
and file names), the facility supports a crude facility for tuning test
vs. user systems by allowing manipulation of parameters directly
related to performance (timeout values, inhibition of logging).

The components and utilities using this facility are:

    Works Manager
    File Package
    Works Manager Operator
    Checkpointer
    WMO Utility

MISSION
of
Rome Air Development Center

RADC plans and executes research, development, test and
selected acquisition programs in support of Command, Control
Communications and Intelligence (C3I) activities. Technical
and engineering support within areas of technical competence
is provided to ESD Program Offices (POs) and other ESD
elements. The principal technical mission areas are
communications, electromagnetic guidance and control,
surveillance of ground and aerospace objects, intelligence data
collection and handling, information system technology,
ionospheric propagation, solid state sciences, microwave
physics and electronic reliability, maintainability and
compatibility.


